I just bit the bullet and upgraded to premium as part of my “get away from amazon stuff” quest.
I copied what used to be on my Amazon Drive into the directory I set up for B2 syncing. It’s about 17GB. I have gigabit fiber at home. I started this last night, and today it was only about 15% done. The list of “waiting” files was like 1,000 files, with a note it was “auto-retrying”.
I quit odrive and started it again and now it’s moving at a more respectable 240 Mb/s to 650 Mb/s.
Is there anything I can do when it slows down other than to quit and restart odrive?
edit: this is what the “waiting files” section looked like. Also, if it works, lots of client website files in here, so tons of tiny php/css/js files.
Hi @amazon12,
With all of those files waiting to retry, it seems like the client was hitting some exceptions when trying to upload.
If you are still seeing a large list of waiting changes, can you send a diagnostic from the odrive menu so I can take a closer look at what the issue might be?
B2 has upload endpoints that can get “too busy” and will then start returning 503s back to us. We can handle these by grabbing a new upload endpoint, so I’ll look at quickly getting that change into the integration to deal with this situation.
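To make that concrete, the pattern B2 documents for this looks roughly like the sketch below (simplified Python, not our actual integration code; it assumes api_url and account_auth_token came from an earlier b2_authorize_account call):

```python
# Rough sketch of B2's documented retry pattern for "too busy" upload endpoints:
# on a 503, throw away the current upload URL, ask for a fresh one, and retry.
import hashlib
import time
import requests

def get_upload_url(api_url, account_auth_token, bucket_id):
    """Ask B2 for a (possibly different) upload endpoint and upload auth token."""
    r = requests.post(
        f"{api_url}/b2api/v2/b2_get_upload_url",
        headers={"Authorization": account_auth_token},
        json={"bucketId": bucket_id},
    )
    r.raise_for_status()
    return r.json()  # contains "uploadUrl" and "authorizationToken"

def upload_with_retry(api_url, account_auth_token, bucket_id, name, data, max_tries=5):
    upload = get_upload_url(api_url, account_auth_token, bucket_id)
    for attempt in range(max_tries):
        r = requests.post(
            upload["uploadUrl"],
            headers={
                "Authorization": upload["authorizationToken"],
                "X-Bz-File-Name": name,  # must be URL-encoded for real file names
                "Content-Type": "b2/x-auto",
                "X-Bz-Content-Sha1": hashlib.sha1(data).hexdigest(),
            },
            data=data,
        )
        if r.status_code == 503:  # endpoint too busy
            upload = get_upload_url(api_url, account_auth_token, bucket_id)
            time.sleep(2 ** attempt)  # brief backoff before retrying
            continue
        r.raise_for_status()
        return r.json()
    raise RuntimeError(f"gave up on {name} after {max_tries} attempts")
```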
The restart will reinitialize the B2 connection, which will grab a different upload endpoint. That is why you are seeing better results once you restart odrive.
Hi @amazon12,
I took a closer look at the integration, and we are actually already retrieving new URLs. The diagnostic shows that lots of things are uploading, but there are files that will inevitably hit this 503 error from B2. The appearance of better behavior after a restart is probably more of an illusion, now that I’ve dug into it more.
Since you have so many small files there are a lot of upload attempts happening, so you will start to see the retry queue fill up with files that were rejected for 503s. Those will eventually retry. While they are in the queue, odrive will still process and upload other files.
Despite having a connection capable of very fast uploads, you are not going to see great transfer rates while uploading all of these little files. The transaction overhead for a B2 upload is not insignificant, and odrive will only upload a max of 4 files at a time, by default (this may be editable soon in our forthcoming advanced config files). The same four threads are also responsible for adding new folders, which has its own overhead (it looks like there were already ~22,000 folders that odrive was tracking at the time of the diagnostic).
In summary, what you are seeing is expected with B2. odrive is requesting new URLs, but there are so many small files to upload that you are going to end up seeing lots of items hit the “waiting” queue, and transfer speed is going to look pretty slow until you hit some bigger files.
I see that there may be a couple of optimizations we can make in the integration, so I will look into that. If we can expose the concurrent upload setting (not sure about this one yet), you can also try cranking that up to help with speed.
Thanks Tony - the thing that made me think the restarts were speeding things up is that I’d see a large spike in upload bandwidth. Right now it’s working on a folder at maybe 30KB/s tops, with a queue of 372 items. CPU usage is low.
I restart odrive and there’s a CPU spike that lasts a while, perhaps just odrive doing some sanity checks on all the remote services, and a network spike that seems to last a half hour or so.
As noted in the odrive2 forum, you are correct, my work involves a zillion tiny files…
Hi @amazon12,
I think the spike may be due to the way odrive tries to prioritize uploads.
It will try to upload smaller files first, and then larger files, once it has enough files in the queue to sort. When odrive initially starts, it will start uploading as quickly as possible, so the sorting hasn’t taken place yet. It may be hitting larger files during this initial window, which would give you better performance. I also expect that you’ll start to see better speeds as odrive makes its way through all the small stuff.
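Just to illustrate the idea (a loose sketch of the ordering only, not our actual code):

```python
# Loose illustration of "smallest first" upload ordering: once enough pending
# files have accumulated, sort the batch by size so the many tiny files clear
# the queue before the big ones.
import os

def next_upload_batch(pending_paths, batch_size=100):
    batch = pending_paths[:batch_size]
    return sorted(batch, key=os.path.getsize)
```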
Our newer versions of odrive allow for tweaking concurrency.
There are two config files in the root of the odrive folder. If you open the odrive_user_premium_conf.txt file you can change the value of maxConcurrentUploads from 4 to a higher number. You can experiment with that to see what might work best for you. Once you save the change, restart odrive for it to take effect. It should provide you with better performance than you are seeing now.
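If you would rather script the change than edit by hand, something like this works. It is a sketch that assumes the conf file holds a flat JSON object (open it first to confirm) and that the odrive folder sits in your home directory; adjust the path to your setup:

```python
# Hedged sketch: bump maxConcurrentUploads in odrive_user_premium_conf.txt.
# Assumes the file holds a flat JSON object -- open it first and confirm --
# and that the odrive folder sits in your home directory.
import json
from pathlib import Path

conf_path = Path.home() / "odrive" / "odrive_user_premium_conf.txt"  # adjust to your setup

conf = json.loads(conf_path.read_text())
conf["maxConcurrentUploads"] = 8  # experiment with higher values
conf_path.write_text(json.dumps(conf, indent=2))

# Restart odrive afterwards so the new value takes effect.
```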
Hi @amazon12,
You may not see much difference with only 8. The stats you listed put the average file size at around 16KB. Even at 8 concurrent, you are only going to be seeing ~250KB/sec, if we estimate a 0.5 sec roundtrip for a single upload (this is probably an optimistic estimate, too).
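Here is the quick arithmetic behind that estimate (the 0.5 sec roundtrip is an assumption, not a measurement):

```python
# Back-of-the-envelope throughput estimate for many tiny uploads.
avg_file_size_kb = 16    # average from your stats
roundtrip_sec = 0.5      # assumed per-file upload overhead (probably optimistic)

for workers in (4, 8, 20, 30):
    files_per_sec = workers / roundtrip_sec
    print(f"{workers:>2} concurrent -> ~{files_per_sec * avg_file_size_kb:.0f} KB/sec")
```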
Try bumping it to 20 (or even 30) and make sure the bandwidth throttling setting in the odrive menu is set to “unlimited” for upload.
Once you’ve saved the new setting in the config file and restarted odrive, can you let it run for a bit and then send a diagnostic so I can take a look?
Hi @amazon12,
I’m glad those settings helped to get everything up a little quicker.
The diagnostic shows you have about 850,000 objects to track, so the overhead will be fairly substantial. odrive should be able to deal with it, but it would be best to unsync folders that you can consider “archived” and don’t need quick access to. Are you able to unsync portions to reduce the scope?
Additionally, there is a setting in the odrive menu to turn off periodic background scans. It is at the top, under “Ready to sync new changes”, and is called “Disable background scanning”. With this setting enabled odrive will still upload any local changes, but it won’t interrogate the storage for remote changes unless you are navigating into those folders (which will then kick off an on-demand remote query for that folder). Note that this setting needs to be re-enabled each time odrive is started.
I do wish OS X had the ability to “peek” into a zip like Windows does; that would totally solve this problem for me (would be a nice odrive feature!).
What’s the best way to deal with zipping the most obnoxious archives up to minimize B2 transactions? Stop odrive, do my work, and then restart, or just let it run while I’m zipping and then deleting the original directories?
Hi @amazon12,
If you use the unsync capability on a folder (Manage Disk Space) you can collapse it and remove its structure from odrive’s view. This may depend on whether you are able to cordon off particular sections, so that you can unsync them without losing access to the areas you need immediate access to.
Zipping to an archive outside odrive and then moving that in should be fine. If you zip inside odrive, it will constantly interrogate that file to see if it can sync it, which can produce some general overhead. Try to keep the zip files to a reasonable size to facilitate consistent upload and eventual download.
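If it helps, here is a rough sketch of that flow (the paths are just examples; substitute your own):

```python
# Build the zip somewhere odrive is not watching, then move the finished
# archive into the synced folder in a single step.
import shutil
from pathlib import Path

source_dir = Path.home() / "client_sites" / "old_project"   # folder you want to archive
staging = Path.home() / "staging" / "old_project"            # zip is built here, outside odrive
dest_dir = Path.home() / "odrive" / "B2" / "archives"        # synced destination

staging.parent.mkdir(parents=True, exist_ok=True)
archive = shutil.make_archive(str(staging), "zip", root_dir=str(source_dir))
shutil.move(archive, str(dest_dir / Path(archive).name))
```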
Hi @amazon12,
Before taking any other actions, please read all of the information below and let me know if you have any questions about anything.
It looks like the two zip files you are trying to upload may be hitting an error on B2 when uploading. Instead of zipping them up and uploading those to B2, I think a better option is to just unsync those folders on the local side. The cost will actually be less than converting things into zip files, since you’ve already uploaded that data, and you will be able to keep those items as individual files and folders so you can retrieve specific items, if/when needed.
The “sync” cost is really going to come down to the amount of data you have “exposed” locally. This means folders that have not been unsynced (converted into placeholders) yet.
My recommendation is to do the following:
1. Do not sync the local deletes (the items in the odrive Trash bin) to B2.
2. Restore the folders in the odrive trash that you zipped up. From within the “Trash bin” submenu you can click on the individual items to restore them, or click on “Restore all trashed items” to recover them all. This cancels the pending deletes so that they are never synced to the cloud. When you restore from trash, the items are restored as placeholders (unsynced).
3. In the future, for the folders that you do not need local access to and want to “archive”, right-click on those folders and select unsync. This will leave them as they are in the remote storage, but turn them into placeholder files locally (.cloudf).
4. Remove the large zip files you created. They will not be needed since you already have that data remotely.
Important: This plan will not work if you have already sent your deletes to B2 (clicked on “Empty trash and sync all deletes” in the “Trash bin” submenu). Please verify this so we know how to proceed.
I definitely already sent my deletes; I assumed that’s what’s hanging…
I’m totally OK with not having the unzipped stuff anywhere. It’s archival, and if I need to poke in the zip files I’m OK with unpacking them outside of an odrive-watched directory…
I get the idea of having some stuff only in B2, but I really like the idea of the data existing in at least two places (my desktop machine + cloud).
Hi @amazon12,
Are you seeing an error when clicking on the “Empty trash and sync all deletes” option, or is it just not doing anything once you do that?
I need to have the team look at why your two files are not uploading. Can you send one more diagnostic after retrying the empty trash option?
There is no immediate error, but after a very long time I’ll get an error about a single directory. If it pops up again, I’ll record it and post here. I just hit the “empty trash and sync all deletes” button and sent a diagnostic.
Hi @amazon12,
I think the issue here is that deletes on B2 can be very slow, due to the nature of the storage. Since there aren’t really any folders, odrive has to delete every single one of the files, so the more files you have, the longer it can take and the more often it can run into errors.
If you want, you can run a CLI script that will continue to retry the trash emptying, even if it hits several errors.
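The idea is just a loop like the one below (a rough sketch; it assumes the odrive CLI is installed and exposes an emptytrash command, and the exact command path and exit behavior may differ on your install):

```python
# Keep retrying "empty trash" through the odrive CLI until it succeeds or we
# run out of attempts. Command name/path and exit codes are assumptions --
# check them against the CLI version you have installed.
import subprocess
import time

MAX_ATTEMPTS = 50

for attempt in range(1, MAX_ATTEMPTS + 1):
    result = subprocess.run(
        ["python", "odrive.py", "emptytrash"],
        capture_output=True, text=True,
    )
    print(f"attempt {attempt}: {result.stdout.strip() or result.stderr.strip()}")
    if result.returncode == 0:
        break
    time.sleep(30)  # give B2 a breather before retrying
```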
We also found an issue with larger file uploads (over 5GB in size), which is preventing your two large files from uploading. We should be able to release a new version to address that next week.