Sync generates duplicates

My migration from Amazon Drive to OneDrive hasn’t gone smoothly… My next problem is bizarre. After I set up a sync, I saw a lot of “Not allowed” items. Every file and folder that wasn’t allowed seemed to have Å, Ä, or Ö in its name. Upon closer inspection, I noticed the files had been duplicated (exactly) and weren’t allowed to sync because they already existed in that location.

[Screenshot: Skärmbild_20230201_211504]

What’s going on here? This appears to be entirely unrelated to my other problems with backups.
I sent a diagnostic.

Regards
/Staffan

Hi @staffan,
These files look like they have the same name, but they are actually different.

Let’s take this file name, for example:
Scannat Ängsullsv…pdf

The above name looks the same as this file name:
Scannat Ängsullsv…pdf

However, the first one uses the single Unicode character U+00C4 (Latin Capital Letter A with Diaeresis), while the second one uses a plain “A” followed by the Unicode character U+0308 (Combining Diaeresis).
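To make the difference concrete, here is a minimal Python sketch. The string is just a shortened stand-in for the start of the file name, not the real full name, and NFC is only one of the standard Unicode normalization forms, picked here for illustration.

```python
import unicodedata

# "Ä" as a single precomposed code point (U+00C4)
precomposed = "\u00c4ngsullsv"
# "A" (U+0041) followed by a combining diaeresis (U+0308)
decomposed = "A\u0308ngsullsv"

# The two strings render the same but are made of different code points
print(precomposed == decomposed)          # False
print(len(precomposed), len(decomposed))  # 9 10

# Normalizing both to the same form makes them compare equal
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)                     # True
```

A file system that stores names exactly as given will treat these as two separate files, which is why both can sit side by side in the same folder.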

OneDrive must be doing some conversion on their side that makes these filenames appear to be the same, even though they are different. If they were exactly the same, Windows wouldn’t allow you to have two of them in the same folder.

Maybe OneDrive isn’t accepting one of the characters and is converting it. However, I wonder why the old one is still there. It’s pretty annoying. Hundreds of files and folders are being duplicated, and odrive has to be involved in making or keeping them, I think? The appropriate action would be to keep the newer files, I suppose. Otherwise they would just be converted again.
Do you know of a safe and fast way of removing the duplicates?

Thanks, Tony, your support is excellent!

Hi @staffan,
From the logs I was looking at, it looked like these duplicates already existed by the time odrive encountered them. You can see in this progression that odrive sees the new data to upload, creates the parent folder, and starts uploading the files inside:

01 Feb 06:51:28PM INFO Successful Create Folder (Local to Remote) for D:/odrive/Luckycat/Documents/Angsullsvagen/Dokument

01 Feb 06:51:30PM INFO Successful Upload File for Item: D:/odrive/Luckycat/Documents/Angsullsvagen/Dokument/Scannat Ängsullsv..pdf - Size: 1622690 Bytes - Date: 2019-05-12 22:00:35

01 Feb 06:51:31PM ERROR Failed Upload File for Item: D:/odrive/Luckycat/Documents/Angsullsvagen/Dokument/Scannat Ängsullsv..pdf - Size: 1622690 Bytes - Date: 2019-05-12 22:00:35 - Error: code OD_PATH_ALREADY_EXISTS caused by OneDriveRequestException(code OD_PATH_ALREADY_EXISTS - {"error":{"code":"nameAlreadyExists","message":"An item with the same name already exists under the parent"}})

There are no downloads from OneDrive and odrive will never create any files on its own.

I also tested using the same file name in a few different scenarios and did not encounter any instances where a copy of the file was automatically created with the different unicode character. I was able to reproduce the OneDrive issue where Microsoft is treating these two different names the same, though, so they must be normalizing the name somehow.
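If you want to check a local folder for name pairs like this before syncing, something along these lines might help. This is only a sketch: it assumes the remote side treats names as equal when their NFC forms match, which is my guess at the behavior and not something Microsoft has confirmed here. The function names are made up for this example.

```python
import os
import unicodedata
from collections import defaultdict

def find_normalization_collisions(names):
    """Group names that differ as raw strings but match after NFC normalization."""
    groups = defaultdict(list)
    for name in names:
        # Assumption: NFC stands in for whatever normalization the remote applies
        groups[unicodedata.normalize("NFC", name)].append(name)
    # Keep only groups that contain more than one distinct raw name
    return [sorted(g) for g in groups.values() if len(set(g)) > 1]

def scan_tree(root):
    """Yield (directory, colliding names) for every folder under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for group in find_normalization_collisions(dirnames + filenames):
            yield dirpath, group
```

Calling `list(scan_tree(path))` on the sync folder would then list each directory together with the names in it that collapse to the same normalized form.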

  • Did these files come from Amazon Drive?
  • If so, are you able to see if they exist there too?

For removing the duplicates:
odrive should be tracking these in the “not allowed” list as it hits them. You can get a full listing of this by clicking on “Detailed odrive status” in the odrive tray menu.

Once you have the list, you could sanitize it down to just the file paths and use a Bash or PowerShell script to move the files to another location or delete them.
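As a rough sketch of that idea (in Python here, though the same could be done in Bash or PowerShell), the following moves every listed path into a separate folder instead of deleting it. It assumes you have already saved the sanitized list as a UTF-8 text file with one absolute path per line; the function name and parameters are made up for this example. It defaults to a dry run that only prints what it would do.

```python
import shutil
from pathlib import Path

def move_listed_files(list_file, source_root, quarantine_root, dry_run=True):
    """Move each path in list_file from under source_root into quarantine_root."""
    source_root = Path(source_root)
    quarantine_root = Path(quarantine_root)
    moved = []
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        src = Path(line)
        if not src.exists():
            # Already moved (e.g. with its parent folder) or mistyped
            print(f"skip (missing): {src}")
            continue
        try:
            rel = src.relative_to(source_root)
        except ValueError:
            print(f"skip (outside source root): {src}")
            continue
        dest = quarantine_root / rel
        print(f"{'would move' if dry_run else 'moving'}: {src} -> {dest}")
        if not dry_run:
            # Recreate the relative folder structure under the quarantine root
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
        moved.append((src, dest))
    return moved
```

Run it once with the default `dry_run=True` and check the output, then again with `dry_run=False`. Keep the list file byte-for-byte as exported: if an editor normalizes the characters while saving, the paths may end up pointing at the other copy of each file.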

You’re right. These files were created earlier; I only noticed them now because of the move. The duplicates are on Amazon Drive too. Maybe it has to do with Windows/Mac and Amazon’s ability to sense the difference in character encoding?

I’ll probably never understand exactly how these duplicates got made, but thanks for looking into it in depth. I really appreciate it.
