Treat arXiv PDF Link the Same as Landing Page Link

Depending on who sent me an arXiv paper, I sometimes get a link directly to the arXiv PDF. Other times, I get the paper’s high-level arXiv landing page. As an example:

Landing Page:
https://arxiv.org/abs/2006.16241
Paper:
https://arxiv.org/abs/2006.16241.pdf

I use the Paperpile extension in Chrome to add papers to my Paperpile account. Right now, clicking “Add to Paperpile” produces different entries depending on which arXiv page I am on. You can see that with the above paper. If I click “Add to Paperpile” on the landing page, all is swell and dandy. In contrast, if I click it on the PDF, the extension resolves the citation to a GitHub link that appears in the paper.

Without fully understanding all the constraints, here is how I would handle this: if someone clicks “Add to Paperpile” on the PDF in arXiv, I would add the landing page information instead (just strip the “.pdf” off the link). That way the behavior is consistent and returns better results.
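To make the idea concrete, here is a minimal Python sketch of the link rewrite. It is only an illustration, not how the Paperpile extension works: the function name `arxiv_landing_url` is made up, and the extension itself is presumably not written in Python. Besides stripping “.pdf”, the sketch also swaps a leading `/pdf/` for `/abs/`, since arXiv normally serves PDFs under `/pdf/`.

```python
from urllib.parse import urlsplit, urlunsplit


def arxiv_landing_url(url: str) -> str:
    """Map an arXiv PDF link to its abstract (landing) page.

    Hypothetical helper for illustration. URLs that are not arXiv
    paper links are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() not in {"arxiv.org", "www.arxiv.org"}:
        return url  # not arXiv: leave the link alone

    path = parts.path
    # Strip a trailing ".pdf" (case-insensitive), as suggested in the post.
    if path.lower().endswith(".pdf"):
        path = path[: -len(".pdf")]
    # arXiv PDFs normally live under /pdf/<id>; the landing page is /abs/<id>.
    if path.startswith("/pdf/"):
        path = "/abs/" + path[len("/pdf/"):]
    if not path.startswith("/abs/"):
        return url  # some other arXiv page (listing, search, ...)

    # Drop any query string or fragment; the landing page does not need them.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

With this rule, both forms of the PDF link and the landing page link itself all end up at `https://arxiv.org/abs/2006.16241`, so “Add to Paperpile” would see the same page no matter which link I was sent.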

I assume this idea applies to other domains. I know it applies to NeurIPS papers on their website. It probably applies to bioRxiv as well (I am just guessing).

Here is what I get when I try to add the paper from the PDF first. If you have already added the document to your account, you get different behavior, so try adding the PDF first. You may also need to try it on a fresh paper.

Thanks for the input, @ZaydHammoudeh. I believe this is related to the way the information is coded by the publisher rather than our tool’s ability to parse it. In any case, I’ve consulted with the team and will let you know when I hear back.

Thanks @vicente for the quick reply. I get your point on this. However, for organized, popular sites like arXiv, I think the most accurate citation information will always be at the abstract link, not embedded in the PDF.

For curated sites (e.g., Springer, Nature, bioRxiv, NeurIPS, etc.), I think having the citation analyzer check the domain first and (if applicable) redirect to the superior, standardized citation source will almost certainly lead to better, more predictable, more accurate outcomes.
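A rough sketch of what I mean by checking the domain first, again in Python and purely illustrative. The names `DOMAIN_RULES` and `citation_source` are hypothetical, and only the arXiv rule is filled in. Rules for the other sites would need to be worked out from each site's actual URL layout, which I have not checked.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit


def _arxiv_rule(path: str) -> str:
    """Rewrite an arXiv PDF path to the abstract-page path."""
    if path.lower().endswith(".pdf"):
        path = path[: -len(".pdf")]
    if path.startswith("/pdf/"):
        path = "/abs/" + path[len("/pdf/"):]
    return path


# Hypothetical registry: host name -> function that rewrites the URL path
# to the site's standardized citation page. Only arXiv is filled in here.
DOMAIN_RULES: Dict[str, Callable[[str], str]] = {
    "arxiv.org": _arxiv_rule,
}


def citation_source(url: str) -> str:
    """Return the preferred page to read citation data from.

    Known domains are redirected by their rule; anything else is
    returned unchanged so the existing parsing can handle it.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    rule = DOMAIN_RULES.get(host)
    if rule is None:
        return url  # unknown site: fall back to current behavior
    return f"{parts.scheme}://{parts.netloc}{rule(parts.path)}"
```

The point of the table is that adding another curated site means adding one entry, while every site without a rule keeps today's behavior.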