Incorrect auto-correction of MIS Quarterly journal name


#1

Hi,

When I add MIS Quarterly journal articles, the journal name is always auto-corrected: the abbreviation is imported as “MISS Q.” and the full journal name as “The Mississippi quarterly”. This happens both when importing directly from the journal page and when importing from Google Scholar, even though, e.g., the BibTeX item on Google Scholar seems correct:

    @article{<foo>,
      title={<title>},
      author={<author>},
      journal={MIS Quarterly},
      year={2016}
    }

Is there any way I can prevent Paperpile from making this kind of incorrect auto-correction?


#2

That’s clearly a false positive in our journal name normalization procedure. There is nothing you can do on your side, I’m afraid. But it should be an easy fix and we’ll try to get it into the next update.


#3

Hi Stefan,

I complained about this more than a year ago, and at the time you gave me the same response. This is by the way not the only journal whose name is auto-corrected mistakenly. Isn’t there a more systematic way to let you know about such issues?

Mahmood


#4

We don’t get many complaints about false positive journal corrections. Actually it’s extremely rare. As far as I remember @andreas fixed all the issues that were reported. The best way to report these issues is to use the in-app messenger.


#5

Not intending to unnerve you, but maybe there are not many in the fields that you and the majority of your users know best. I can tell you I have seen a few, and they are all among the best-known business/management journals.

Here is another one: Academy of Management Journal. See the automatically detected full journal name.

Here is yet another one: Academy of Management Review.

All these I have to fix manually every time, and if I hit auto-update by mistake, the same happens again.


#6

Maybe you don’t get many complaints, but there is also no way to react immediately to false positives, e.g., an undo button or just a way to flag the preceding normalization as a false positive.

Users have to report false positives by reporting the issue here or within the app using natural language to describe the issue. The alternative for users is to correct the names manually. This is what I did for a year until I reported the MIS Quarterly issue in March this year (see above). While manually correcting names is a quick fix in such situations (btw. I’m still doing this on a daily basis), it nevertheless shapes the overall user experience.

The number of users who adjust the journal name after the procedure has finished might be an appropriate measure for estimating the number of false positives.
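
To make the idea concrete, here is a minimal sketch of such a measure. It assumes Paperpile could log, for each import, the normalized journal name and whether the user changed it afterwards; the event format, the function name `correction_rates`, the `min_imports` cutoff, and the sample data are all hypothetical and not anything Paperpile actually exposes.

```python
from collections import Counter


def correction_rates(events, min_imports=5):
    """Return, per normalized journal name, the share of imports that
    the user edited by hand afterwards.

    events: iterable of (normalized_name, was_manually_changed) pairs
            (a hypothetical log format, assumed for illustration).
    Names with fewer than min_imports imports are skipped as too noisy.
    """
    imports = Counter()
    changed = Counter()
    for name, was_changed in events:
        imports[name] += 1
        if was_changed:
            changed[name] += 1
    return {
        name: changed[name] / total
        for name, total in imports.items()
        if total >= min_imports
    }


# Made-up sample data: 9 of 10 imports of one name were corrected by hand.
events = (
    [("The Mississippi quarterly", True)] * 9
    + [("The Mississippi quarterly", False)] * 1
    + [("Nature", False)] * 20
)
rates = correction_rates(events)
for name, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rate:.0%} manually corrected")
```

A name whose correction rate is far above the rest would be a likely false positive worth a manual look.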


#7

Of course we would like to have 100% sensitivity and 100% specificity with our journal name normalization. Unfortunately this is not possible.

That does not mean we don’t want to improve this feature. Actually I talked to @andreas about this particular false positive and it turns out he is working on an updated journal list, which should make it easier to tweak the behaviour of journal name auto-correction.

And when I say we get “few complaints”, that already factors in that many users most likely don’t report problems here. At this point I think we can reliably estimate the impact of certain problems from our support request data. And in the big picture auto-correction seems to work really well.


#8

@stefan, I understand your developer perspective, and I agree that in the “big picture” the recognition engine works pretty well. But please note the user side: the fact that Paperpile misidentifies the Academy of Management journals (all four of them) and MIS Quarterly translates for me into a 30% miss rate, because those are the main journals in my field. So you may see it as an overall 99% hit rate, but for me it is roughly one paper in three mishandled and requiring manual editing. I think it is important to keep this in mind as you update the engine. There must be a way to systematically use the user feedback on this.
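
As a rough illustration of why the two numbers can both be true, here is a small sketch. The shares below are assumptions picked to match the figures in this post (affected journals making up 1% of all imports overall but 30% of my library), not measured data, and `miss_rate` is just a helper name I made up.

```python
def miss_rate(library_share, error_rate):
    """Expected fraction of mishandled papers in a library.

    library_share: journal group -> fraction of the library it makes up
    error_rate:    journal group -> fraction of its papers mislabeled
    """
    return sum(
        share * error_rate.get(group, 0.0)
        for group, share in library_share.items()
    )


# Assumption: the affected journals are always mislabeled, all others never.
error_rate = {"affected": 1.0, "other": 0.0}

# Assumed shares, chosen only to reproduce the 99% vs. 30% figures above.
all_users = {"affected": 0.01, "other": 0.99}
my_library = {"affected": 0.30, "other": 0.70}

print(f"overall miss rate: {miss_rate(all_users, error_rate):.0%}")
print(f"my miss rate:      {miss_rate(my_library, error_rate):.0%}")
```

The same handful of bad entries in the journal list is negligible on average and very costly for anyone whose field depends on exactly those journals.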


#9

It is very hard to get a reliable list of journal names and their abbreviations. The most credible source out there is the list maintained by the National Institutes of Health (NIH). We imported their data, and for the “Academy of management review” they list the full title as “Academy of management review. Academy of Management”, see here:
https://www.ncbi.nlm.nih.gov/nlmcatalog/9877758

We are working on an update and hope to fix it soon.


#10

I don’t think there is any single authoritative source with clean data out there, so any single source (or combination of sources) will have its own shortcomings. But I wonder why you don’t implement a simple interface that, with our (the users’) permission, aggregates our corrections and so creates a crowdsourced input stream for you as well. After that, it is easy to set up triggers that fire when an item receives too many corrections.
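
Something along these lines would already help. This is only a sketch of the idea, assuming corrections arrive as (user, auto-corrected name, user-entered name) records; the record format, the function name `flag_suspect_names`, and the `min_users` threshold are my own inventions and say nothing about how Paperpile stores its data.

```python
from collections import defaultdict


def flag_suspect_names(corrections, min_users=3):
    """Flag auto-corrected journal names that several distinct users
    changed to the same replacement.

    corrections: iterable of (user_id, auto_name, corrected_name)
                 (a hypothetical record format, assumed for illustration).
    Returns {auto_name: most_common_replacement} for every name whose
    top replacement was entered by at least min_users different users.
    """
    voters = defaultdict(lambda: defaultdict(set))
    for user_id, auto_name, corrected_name in corrections:
        if auto_name != corrected_name:
            # A set per replacement, so each user counts only once.
            voters[auto_name][corrected_name].add(user_id)

    flagged = {}
    for auto_name, candidates in voters.items():
        best, users = max(candidates.items(), key=lambda kv: len(kv[1]))
        if len(users) >= min_users:
            flagged[auto_name] = best
    return flagged


# Made-up sample corrections.
corrections = [
    ("u1", "The Mississippi quarterly", "MIS Quarterly"),
    ("u2", "The Mississippi quarterly", "MIS Quarterly"),
    ("u3", "The Mississippi quarterly", "MIS Quarterly"),
    ("u1", "The Mississippi quarterly", "MIS Quarterly"),  # same user again
    ("u4", "Nature", "Nature Comm."),  # one user only, below threshold
]
flagged = flag_suspect_names(corrections)
print(flagged)
```

Counting distinct users rather than raw edits keeps one person who fixes the same journal every day from tripping the trigger alone, and the flagged list could go to a human for review rather than being applied automatically.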


#11

Hi @andreas and @stefan.

I would like to let you know how frustrated I am, as a paying user, that after 14 months these problems persist and both MIS Quarterly and the Academy of Management journals are still mislabeled in the metadata. I understand that there is no single authoritative source, but the least you can do for your users is to take their feedback into account and fix the obvious sources of error. While we are waiting for your “Grand Solution” to tackle all metadata problems in the academic world, maybe you can patch these few frustrating instances? Maybe you can develop a suggestion system for your users? Maybe you can feed the user-entered corrections into an algorithm that signals which corrections are needed in your database? But please do something. How many years should a user wait for you to fix a label, even if you have the perfect alibis not to do it?


#12

@stefan and @andreas

Really guys, what amazing customer support you have. 16 months and counting. Isn’t it “soon” yet? At least add an option to allow us to do batch fixes. Your software mislabels almost all the most important publications in my field. I am starting to wonder whether I am working for the software and not the other way around.


#13

I’m sorry about that. There was some misunderstanding between me and Andreas about what the course of action was and who would follow up on this thread. My understanding was that we could fix the few pathological cases on a one-by-one basis (which may not have been easily possible, given that it’s not done by now).
I’ll follow up on that.


#14

It’s good to hear back from you finally. The news is that some of the journals that used to be mis-labeled before are now being mis-labeled differently since a few days ago. For instance Academy of Management Review used to be labeled as “Academy of Management Review. Academy of Management”. Now it is being labeled as AMRO. In our field we do call this journal AMR, but not AMRO. I don’t know where that name comes from.

Ultimately, the only good way to fix these metadata issues is by letting the users help you as they identify the false positives and correct the wrong labels. None of the databases, unfortunately, is the ultimate reference on that, and none of the academic publishers cares nearly enough to provide a solution for that. They have easy money and an unbeatable oligopoly. You can just give back power to the people, using a simple learning algorithm and a part-time human moderator.


#15

The issue still persists. I uploaded a perfectly ‘clean’ list of references, with no mistakes, from my EndNote library. Paperpile completely messed up a whole bunch of them. There MUST be a better way of dealing with this issue than what we’ve seen here from you over the past year or so. Can you at least allow users to switch off the auto-correct function?!