Specify custom BibTeX key formats for entire library

It has been an eternity since this issue was first reported, and an eternity since anyone has commented on it. Any update on whether/when this might get implemented? I’m a recent convert from Mendeley, and sorely miss being able to control how citation keys get formed.

Thank you for reviving this topic. I realize it’s been a long time but it seems we haven’t been able to prioritize this topic. As we continue developing in several directions, we are approaching a point where we’ll be able to focus on improvements and implementation of features. This one will certainly be among them.

As Vicente said, that has been on the back burner for quite some time but we have not forgotten about it. There are quite a few other features related to BibTeX/LaTeX (like automatically exporting BibTeX files, Overleaf integration) which are on our roadmap.

Here is a pretty simple solution that might work: don’t treat the keys as UUIDs for your database. Instead, generate new keys from the user’s template string, if one is specified, at the time you output the bib entries for each item.

If multiple entries are output, just dedupe within that list of publications. If the keys overlap with other entries the user already has in their LaTeX project, that is their problem, not yours.

The citation key format I have been using is the Google Scholar one, e.g. kaelbling1993learning, where learning is the first non-trivial word of the title. This dedupes the keys well enough that I never have problems with conflicts. Note that it is also all lower-case, which would be a nice thing for your template engine to accommodate.

`${firstAuthor.lastName.lower}${year}${title.firstWord.lower}`
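
To make the proposal concrete, here is a minimal Python sketch of export-time key generation with de-duplication inside a single export. It is only an illustration, not how Paperpile works: the entry fields (`last_name`, `year`, `title`), the stopword list, and the `-2`, `-3` collision suffixes are all assumptions, and the sample title is illustrative input.

```python
import re
import unicodedata

# Words skipped when picking the first "non-trivial" title word (assumed list).
STOPWORDS = {"a", "an", "the", "on", "of", "in", "for", "to", "and", "with"}


def ascii_lower(text):
    """Lower-case, strip accents, and drop anything that is not a-z or 0-9."""
    normalized = unicodedata.normalize("NFKD", text)
    ascii_text = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]", "", ascii_text.lower())


def first_title_word(title):
    """Return the first title word that is not a stopword, or '' if there is none."""
    for word in title.split():
        cleaned = ascii_lower(word)
        if cleaned and cleaned not in STOPWORDS:
            return cleaned
    return ""


def make_key(entry):
    """Build a key shaped like lastname + year + first title word."""
    last = ascii_lower(entry["last_name"])
    return f"{last}{entry['year']}{first_title_word(entry['title'])}"


def export_keys(entries):
    """Generate keys for one export; add -2, -3, ... only when keys collide in this list."""
    used = set()
    keys = []
    for entry in entries:
        base = make_key(entry)
        key, n = base, 1
        while key in used:
            n += 1
            key = f"{base}-{n}"
        used.add(key)
        keys.append(key)
    return keys


if __name__ == "__main__":
    sample = [{"last_name": "Kaelbling", "year": 1993, "title": "Learning in embedded systems"}]
    print(export_keys(sample))
```

Because the keys are derived only from the entries in the current export, two different exports can still assign different suffixes to the same paper when its neighbours change.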

I’m not sure what the best approach to this problem is, but my issue is that BibTeX keys can differ for the same reference if I export it at another time. For example, when I started writing my thesis, a certain reference was exported as Oeppen2002-sf but now I am redrafting and the same reference is exported as Oeppen2002-zd.
The earlier comment about whether Smith99b becomes Smith99a if b gets deleted: please don’t make them change automatically. (Maybe it could be a manual option that someone could change if they wanted to.)
Now I am making a new Paperpile folder for the references I am definitely using in my thesis + new references added as I redraft, but I am going to have to mix new exported references with copied old exported citations in order to avoid problems with references in the text that I already wrote a couple years ago.
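
One way to avoid the Oeppen2002-sf versus Oeppen2002-zd situation would be to derive the suffix from something that never changes for a given reference. The sketch below is a guess at such a scheme, not a description of what Paperpile does: `record_id` stands for any stable identifier (an internal ID or a DOI, say), and the function names are made up for illustration.

```python
import hashlib
import string


def stable_suffix(record_id, length=2):
    """Map a stable identifier to a fixed run of lower-case letters."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    letters = string.ascii_lowercase
    # Each digest byte selects one letter; the same input always gives the same letters.
    return "".join(letters[b % 26] for b in digest[:length])


def stable_key(last_name, year, record_id):
    """Build a key such as Surname2002-xx whose suffix depends only on record_id."""
    return f"{last_name}{year}-{stable_suffix(record_id)}"
```

With two letters there are only 676 possible suffixes, so two papers by the same author in the same year could still collide; a longer suffix, or storing the key once it has been assigned, would reduce that risk.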

We are getting serious about this feature now and hope to have something to beta test soon.

In the meantime, we have put together a small survey to better understand the use cases and workflows for LaTeX/BibTeX users:

Thanks.
I would like to add my request to include INSPIRE citation keys as a user-specified export option. This is a well-thought-out and widely used system: https://inspirehep.net/bibliography-generator

thanks,
Rahul

Now in September 2020. Is there any update on this topic?

Welcome to our forum, @Erika_Kawakami! A whole feature to allow customization of citekeys is in the works, along with several other novelties which we’re hoping to release soon. I don’t have an ETA to share just yet, but there should be some updates in our newsletter this fall.

Hi! +1 on this.
Paperpile is a terrific application and I am looking forward to transitioning to it as my only reference manager.
Unfortunately, the way BibTeX keys are managed is a showstopper for me. I have a single BibTeX database with more than 2K papers in it. Some of them form the basis of my research and I reference them multiple times across multiple documents. I typically just type in the keys that I know by heart. I would have no real problem following a different scheme (such as SurnameYYYY), but the fact that the keys use a non-deterministic suffix makes the writing process a lot more cumbersome.
Currently I am using Paperpile to add documents, organize PDFs, read them and so on. But to support my writing I am still bound to Mendeley, because of the specific way BibTeX support is implemented. It’s a pain to continuously migrate between the two, especially because Paperpile’s general feel is much better.

I have seen that improved bibtex support is currently being worked on. Do we have an ETA for the public beta of this feature?

Thanks for the amazing work.

Welcome to our forum, @ilpelle, and thanks for the lovely feedback. Our release schedule has been quite delayed so there is no timeline to share for this yet, unfortunately. We’ve been working on a complete rewrite of our extension and web app; once that is out we’ll move on to implement this and other planned features posted on our roadmap.

Hello,

I started using Paperpile a few months ago and it’s already been quite useful for me. Thanks for the great app, it definitely has lots of potential!

I would like to +1 the issue with BibTeX and the INSPIREHEP database though. If only I could use INSPIRE BibTeX entries and keys, I would be able to make Paperpile my main bibliography management tool and I’d really love to do that! As of now this is kind of a show-stopper for me.

+1 for allowing some choice of citekeys

I’d advocate using the Google Scholar document ID, because it stays the same as a paper moves from working paper to preprint to final published version (the same may be true of the previous commenter’s suggestion of INSPIREHEP, but Google Scholar applies to all scholarly disciplines, not just high energy physics). So if a paper you cite changes its status after you create its citekey, all you need to do is click the (awesome) “update” button and you’ll get the new citation when you recompile the paper with the updated exported bibtex file.

An example is

https://scholar.google.com/scholar?cluster=2961583112046108454&hl=en&as_sdt=0,21&scilib=1

Of course, it would be horrific to have to type \cite{2961583112046108454} throughout your paper, but there’s a simple workaround.

In LaTeX you can have as many “aliases” as you like for the same object. So, if you’ve cited the same paper in two papers of your own, once as MooreNewmanEpi and once as mnPerc, you could have a file paperpile_aliases.sty whose content was:

\newcommand{\MooreNewmanEpi}{2961583112046108454}
\newcommand{\mnPerc}{2961583112046108454}

and then just:

  1. Add \usepackage{paperpile_aliases} to the beginning of your LaTeX document
  2. Search and replace \cite{mnPerc} with \cite{\mnPerc} in one paper, and \cite{MooreNewmanEpi} with \cite{\MooreNewmanEpi} in the other.

This approach would be particularly useful for those of us who have one giant master bib file that we use for all our papers. You could make a corresponding paperpile_aliases.sty file giving all of the alternative aliases for the same bib reference.
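
For a large master bib file, writing the alias file by hand would get tedious, so here is a rough Python sketch that renders the `\newcommand` lines from a mapping of alias to key. The function name and the dictionary format are my own assumptions. Alias names are restricted to ASCII letters because LaTeX command names defined this way cannot contain digits.

```python
def render_alias_file(aliases):
    """Return the text of an alias .sty file: one \\newcommand line per alias."""
    lines = []
    for alias, key in sorted(aliases.items()):
        # LaTeX control words may contain letters only, so reject anything else.
        if not (alias.isascii() and alias.isalpha()):
            raise ValueError(f"alias must contain only ASCII letters: {alias!r}")
        lines.append(f"\\newcommand{{\\{alias}}}{{{key}}}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    mapping = {
        "MooreNewmanEpi": "2961583112046108454",
        "mnPerc": "2961583112046108454",
    }
    print(render_alias_file(mapping), end="")
```

The returned text can then be written to paperpile_aliases.sty next to the document.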

PS. Is there any ETA for the release of whatever you’re doing on this?

Welcome to our forum, @llorracc! Thanks for the feedback – I’ve added it to our tracker for the team’s consideration. The upcoming renovations of our web app and extension (which will include a new citekey editing tool) should be up and running before the end of the year. In case you haven’t already, check out our roadmap to see what else is in store.

+1 for this feature. The current solution seems to tack a unique 2-letter ID onto each BibTeX key, but as far as I can tell the ID is arbitrary, which makes it difficult to know which paper is which in \cite auto-complete.

Thanks Paperpile team!

Any news on this? I have loved all the updating and classifying options so far, but the thought of not being able to control the citation keys is a bit of a nightmare — considering I have about 1.5k of those!

How about using a general scheme like AuthorYearFirstwordsoftitle-twocharacteridentifier for the whole database? For example, for a paper by Larsen in 2022 with the title “Automatic discover of blabla”, the key would be Larsen2022automatic-xy. That way, one can more easily locate the correct reference when citing it, while the existing two-character identifier can still be used to make it easy for the algorithm to generate unique BibTeX keys.

Including the title is really badly needed, in particular when you work in the Asian community, where there are often tens of papers with the same first author name and year in the same database.

I can report that the ability to customize BibTeX keys is currently being actively worked on by the team, as we know this is an important missing feature. In the meantime, a possible workaround (practical for small libraries, laborious for large ones) is to set your own citation keys manually. Select a reference in your library, click the Edit button to open the Edit Details dialog, select Additional Fields, scroll down to Identifiers, select BibTeX key and you’ll find that you can set the BibTeX citation key to whatever you’d like. If you decide to set this field, make sure it is unique in your library. Click Save and Next to change the next citation key.

OK, I’ve just verified that the strategy above works (though it is a bit tedious).

I’m glad to see that Suzanne’s message below says the team is working on some solution. It’s a very tricky problem, though. The problem is that no combination of author, name, year, etc. is guaranteed to uniquely identify a particular paper (though it will usually work). Furthermore, for scholars working at the frontier of research, many of the papers being cited will be working papers that do not yet have a DOI, a PURL, a year of publication, or a journal.

I’d vote that a good solution would be to allow the user to choose a method of unique identification of a paper, and to use that ID to determine whether two papers with the same citekey are truly duplicates. If they have different UUIDs, they’re two different papers by Smith in the same journal in the same year (say). For my purposes, I’d choose the Google Scholar cluster ID, because then I can keep the reference to a given paper the same as the paper moves from working paper to preprint to forthcoming to published. That way, whenever I recompile one of my own papers, it will automatically update all the references to the latest info in my master Paperpile bib.
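
A small Python sketch of that check, assuming each entry is a dictionary with a `citekey` and a user-chosen unique ID stored under `uid` (both field names are invented for illustration):

```python
from collections import defaultdict


def classify_clashes(entries):
    """For every citekey used more than once, report 'duplicate' or 'distinct'."""
    ids_by_key = defaultdict(list)
    for entry in entries:
        ids_by_key[entry["citekey"]].append(entry["uid"])

    result = {}
    for key, uids in ids_by_key.items():
        if len(uids) < 2:
            continue  # a key used once is not a clash
        # Same unique ID everywhere: the same paper entered twice.
        # Different IDs: different papers that merely share a citekey.
        result[key] = "duplicate" if len(set(uids)) == 1 else "distinct"
    return result
```

Entries reported as `distinct` would need a disambiguating suffix, while `duplicate` entries could simply be merged.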

OK, here’s a simpler proposal.

Instead of coming up with your own random letters at the end of the citekey, why don’t you choose (or allow the user to choose), say, the last three characters of the Google collection identifier for the versions of the paper? Those will remain unique once Google has found the paper, even across revisions. So, a citekey generator that involves first author, first word from title, and last 3 from the Google collection would be both reasonably human-friendly and almost always unique.
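
As a rough Python sketch of such a generator (the function name, the hyphen separator, and the sample author and title are assumptions; only the cluster ID comes from the URL earlier in this thread):

```python
import re


def key_with_cluster_tail(last_name, title, cluster_id, tail=3):
    """Combine first author, first title word, and the last characters of a cluster ID."""
    words = re.findall(r"[A-Za-z0-9]+", title)
    first_word = words[0].lower() if words else ""
    return f"{last_name}{first_word}-{cluster_id[-tail:]}"


if __name__ == "__main__":
    # Hypothetical author and title, paired with the cluster ID quoted above.
    print(key_with_cluster_tail("Larsen", "Automatic discover of blabla", "2961583112046108454"))
```

Three trailing digits give 1,000 possible suffixes, so this is "almost always unique" only in combination with the author and title word.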