bioRxiv not importing properly

Papers from bioRxiv are incorrectly being imported with the source “Cold Spring Harbor Laboratory”.


Thanks for the report, @plg. I was able to reproduce this with every paper imported from bioRxiv. I’m not sure whether the problem is on our end or theirs, but the team will let us know soon enough. I’m afraid my only suggestion at the moment is to correct the source manually, but hopefully this won’t take too long to rectify. I’ll reach out with an update when I have one.


It had been working up until about a week ago.

We’ve come to the conclusion that this is an intended change. You’ve probably noticed by now the CSH logo next to bioRxiv’s; since that institution is now operating the site it makes sense they should appear as the publisher in the metadata. An excerpt from their About section:

[Screenshot: excerpt from bioRxiv’s About section]

There’s not much else to say about this, I’m afraid. You can always edit the metadata by hand, but as far as we can tell the publisher is now CSH and not bioRxiv.

Perhaps I misspoke: it’s the “source”, not the publisher, that has changed to “Cold Spring Harbor Laboratory”. Source, as in journal name.

So for example:

The journal Current Biology is published by Cell Press… but when I import a paper from Current Biology, the paper is identified as a “journal article” and the “journal” field is correctly imported as Current Biology, not as “Cell Press”.

For bioRxiv papers, I see that they are imported not as journal articles but as “preprint manuscript”, and there is no field called “journal”, just a field called “source”, which gets populated with “Cold Spring Harbor Laboratory”. But I would argue that the preprint should show the source as “bioRxiv”, which is the equivalent of the journal name (like Current Biology). Cold Spring Harbor Laboratory is the equivalent of the publisher (like Cell Press).

Apologies, @plg, and thanks for the analogy. I was the one who misspoke - clearly meant source instead of publisher since we’re talking about preprints.

I understand the distinction and don’t really see a reason for the sudden change, but the team checked and there are no signs to indicate it was not intentional. Not far-fetched either since bioRxiv is hosted by CSHL. In any case, I’m afraid this is beyond our scope so I have no further insight to offer.

Hi, I also find this problematic - if I saw “Cold Spring Harbor Laboratory” in the bibliography I wouldn’t guess it’s a bioRxiv paper. Unfortunately, this is exactly how bioRxiv papers show up in the bibliography now.
I understand that Cold Spring Harbor Laboratory runs bioRxiv now, but when you go into “citation tools” for bioRxiv papers, they are cited not as “Cold Spring Harbor Laboratory” but as “bioRxiv”, so I would consider this Paperpile change to be a bug.


Also, the BibTeX file clearly states “bioRxiv” in the journal field and “Cold Spring Harbor Laboratory” in the publisher field.
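
To make the expected mapping concrete, here is a minimal Python sketch of how an importer could pick the “source” from those two BibTeX fields. This is only an illustration, not Paperpile’s actual import code: the function name `pick_source` and the dictionary layout are assumptions, and the field values are the ones described in this thread.

```python
def pick_source(entry: dict) -> str:
    """Return the name to show as the source of a reference.

    Hypothetical helper: prefers the BibTeX `journal` field and falls
    back to `publisher` only when no journal is given.
    """
    journal = entry.get("journal", "").strip()
    if journal:
        return journal
    return entry.get("publisher", "").strip()


# Field values as described in this thread for a bioRxiv BibTeX export.
biorxiv_entry = {
    "journal": "bioRxiv",
    "publisher": "Cold Spring Harbor Laboratory",
}

print(pick_source(biorxiv_entry))  # bioRxiv
```

With this ordering the preprint server name is shown as the source, and the publisher is only used when an entry has no journal field at all.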

Hi Vicente and Paperpile team, I fully agree with everyone here that the way Paperpile currently annotates and, more importantly, cites preprints from bioRxiv is not correct. The organization hosting the server is Cold Spring Harbor Laboratory. However, the ‘source’ should be bioRxiv. This is how it is indexed in PubMed and this is how preprints are cited in journals. I hope Paperpile will keep it consistent with the convention and not force users to change it manually every time. And thanks a lot for the great software tool! I could not live without it now. With best wishes, Eugene Valkov (NIH/NCI).

We have fixed the issue with bioRxiv import and we hope to roll out a new extension release later this week.


Thank you! Very much appreciated