Bad characters in PDF filenames cause Google Drive sync errors

A couple of months ago, I began to have recurring Google Drive sync errors like: “google drive needs to quit - an unknown issue occurred - code 2281”. Google support was singularly unhelpful, but I eventually found a solution after first trying a host of bad suggestions from Google Support and from people on the Google Drive Help Forum.

By scrutinizing the Google Drive sync log file, I noticed that it called out the path to a file that it had tried to sync just before the application quit. It was a PDF in one of my Paperpile folders. The filename contained unsupported characters, shown as black diamonds with question marks inside. I deleted that file from Paperpile, from the Paperpile trash folder, from my Google Drive, from the corresponding online drive folder, and from the Google Drive folders on each of my other synced computers. All was fine after that.

Then, just a few days ago, I started getting the recurring sync error again. The log file showed the telltale error: “Exiting due to critical failure: TypeError(‘must be encoded string without NULL bytes, not unicode’,)”, but I couldn’t find a path to a bad file. I checked my online folder and to my surprise found the same bad file that I had removed more than a month ago, present once again in my Paperpile “Trashed Papers” directory. I don’t know whether the file was restored by Google Drive or by Paperpile after being gone for more than a month. Very odd, and a huge waste of time. I deleted all the entries in my trashed papers directory (again) and that seems to have solved the problem.

Would it be possible for Paperpile to reject files with those unsupported characters instead of cataloging them? Does Paperpile introduce those characters?

I know that others have had similar problems, since I read a forum post from someone else who also identified a Paperpile PDF with the same characters as the culprit that caused Google Drive syncing to fail.

Hi Doug, thank you for reporting this. This is definitely a bug we will fix. Could you perhaps give us the title of the paper in question? It is likely that the bad characters were there.

I am not sure how the paper reappeared in your trashed papers directory. It is theoretically possible that it was still in the Chrome cache on one of your computers and was reuploaded when performing a cleanup and resync. One of my colleagues may have more insight on this matter.

We don’t introduce these characters; they usually come from external sources. We already aggressively filter any characters that don’t make sense, so we would need to know which ones you have seen. Otherwise it’s hard to fix.

Also, that’s clearly a bug in the Google Drive client. If Google Drive allows us to upload a file with a given name (probably containing perfectly correct Unicode characters) and then can’t sync it to the hard drive, that’s just broken.

Now that I have deleted the file, I’m afraid I can’t tell you what it was. As I recall, it included a bunch of Scandinavian names with unusual characters. If I am able to find it again I will reply here. Originally I thought it was due to a GD update, since I started having the problem after (regrettably) responding to a message that told me I was using an unsupported Drive version and must update the software. Perhaps that’s when the already uploaded file became unsyncable.

Is there any chance a fix will be released for this soon? It was a very tedious process to clean up the offending files, one of which had the title:
“BIOLOGICALRESEARCHHASREACHEDAPO… - ASMIN&ISHER 4HOMAS!(ENZINGER.pdf”, however I don’t think the bad characters were carried across here.

To help with other users, here are instructions to manually delete the bad files:
-) Quit Google Drive
-) Locate the ‘sync_log.log’ file for Google Drive. (/Users/username/Library/Application Support/Google/Drive/user_default/sync_log.log)
-) Open it, search for ‘paperpile’ and scroll to the bottom of the file.
-) The offending paper will be one of the most recent entries in the log file. Locate the .pdf with the strange looking name.
-) Search for and delete the paper in THREE places. (1) the local paperpile papers folder, (2) online at your Paperpile database, (3) online at Google Drive. For (2) and (3) don’t just delete the paper. Empty the trash also.
-) Open Google Drive app again. The same error may re-appear with a new offending paper title. Repeat the process until all offending papers have been thoroughly deleted.
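
If the log is long, the "search for ‘paperpile’ and scroll to the bottom" step can be roughly automated. This is only a minimal sketch: it assumes the log is a plain text file at the path given above ("username" is a placeholder) and it knows nothing about the log's actual format. The function name and the `limit` parameter are made up for illustration.

```python
from collections import deque
from pathlib import Path

# Path quoted in the instructions above; "username" is a placeholder to replace.
DEFAULT_LOG = Path(
    "/Users/username/Library/Application Support/Google/Drive/user_default/sync_log.log"
)


def recent_paperpile_entries(log_path, limit=20):
    """Return the last `limit` log lines that mention 'paperpile' (case-insensitive)."""
    matches = deque(maxlen=limit)  # keeps only the most recent matches
    # errors="replace" keeps reading even if the log contains undecodable bytes;
    # such bytes show up as U+FFFD in the returned lines.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            if "paperpile" in line.lower():
                matches.append(line.rstrip("\n"))
    return list(matches)


if __name__ == "__main__" and DEFAULT_LOG.exists():
    for entry in recent_paperpile_entries(DEFAULT_LOG):
        print(entry)
```

Run it with Google Drive quit. The offending PDF should be among the last lines printed, but you still have to spot the strange-looking name yourself.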

Have you reported the problem to Google? If a single file that is correctly saved in Google Drive by a third party App via their official API can crash their client that’s a major bug.

I reported the error in the Google Drive Help Forum.
https://productforums.google.com/forum/#!topic/drive/B-D7XFcWVGg;context-place=forum/drive

Thanks. On which operating system and file system are you? Just to be clear, it’s perfectly fine on a Mac to use ( ) ! and according to https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#naming_conventions also on Windows.

I’m on Mac. I think maybe it was replacing the characters with (U+FFFD � ). If the problem resurfaces I’ll try to hunt down the actual bad characters.
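
In case it helps anyone hunt for these files before the client crashes, here is a rough sketch that walks a folder and lists filenames containing U+FFFD, NULL bytes, or other control-type characters. It is a heuristic under my own assumptions, not a statement of what Google Drive actually rejects: the function names are hypothetical, and the Unicode "C" categories it flags also include some legitimate format characters.

```python
import os
import unicodedata

REPLACEMENT_CHAR = "\ufffd"  # the black diamond with a question mark


def is_suspicious(name):
    """True if a filename contains U+FFFD or any character in a Unicode 'C' category.

    'C' categories cover control characters (including NULL), format characters,
    lone surrogates, private-use and unassigned code points. Accented letters
    and punctuation such as ( ) ! are not flagged.
    """
    return any(
        ch == REPLACEMENT_CHAR or unicodedata.category(ch).startswith("C")
        for ch in name
    )


def find_suspicious_files(root):
    """Return sorted paths of files under `root` whose names look suspicious."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if is_suspicious(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

For example, `find_suspicious_files("/path/to/local/paperpile/folder")` (placeholder path) returns a list you can review by hand before deleting anything.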