I have 2 large folders that are not uploading to odrive.
One folder is 111GBs large and the other is 211.5GBs large. When I move smaller files it begins uploading normally.
The issue is that I cannot upload these two folders in batches or divide the contents and upload them in sections. One folder in particular has 2,607 folders and 1,684,461 files. I noticed that Windows crawls when handling folders spread out like this, so moving these folders around is not an option; Windows is practically not capable of doing it. Dividing these files would also take too much time to be worth it, given how many folders there are.
I believe this is an odrive stability issue.
Do you know how many files/folders, total, you are dealing with? You mentioned that it's at least ~1.7 million, but it sounds like that was just one folder.
It may very well be hitting a scaling limit as odrive tries to process all of this data. As you said, even the OS is having trouble dealing with it.
You could try performing the moves from the command line, to mitigate the Windows Explorer performance problems. That should allow the operations to complete without the UI/Explorer overhead.
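For example (the paths here are just placeholders for your actual folders), a short script like this moves a whole tree in one operation, without any Explorer overhead:

```python
import shutil
from pathlib import Path

def move_tree(src: str, dst: str) -> None:
    """Move an entire directory tree in one operation.

    When src and dst are on the same volume, shutil.move falls back to
    os.rename, which is near-instant no matter how many files the
    folder contains -- no per-file copying, no Explorer overhead.
    """
    src_path, dst_path = Path(src), Path(dst)
    if not src_path.is_dir():
        raise ValueError(f"not a directory: {src}")
    dst_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src_path), str(dst_path))
```

On Windows you could also try something like `robocopy /MOVE /E source dest` from a command prompt, which handles cross-volume moves as well.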
Well I guess it would be good to focus on one folder for now, and when I just try it with that one folder (that has 2,607 folders) it still does nothing.
I guess, what I am wishing for is to see if odrive can be improved to handle this number of folders. I know it can handle large single files… but it seems to look at this folder like, “I’m not going to even try”.
I think the best process for odrive would be to upload one folder at a time no matter how time consuming it would be, at least it’s doing something. Moving the entire folder isn’t really much of the issue, but more getting odrive to recognize that it needs to operate on the folder.
Do you see any activity for the odrive process (CPU or network)? Given the size of the dataset, it may just be taking a really long time to scan, index, and then start the upload.
It’s doing nothing. I just got done uploading one folder that was 111GBs in size and had 2,630 files in it. I must note, though, that it didn’t go perfectly. odrive got hung up several times and froze, ending up in a crashed state where it couldn’t do anything.
A restart was the only option to get odrive to work as odrive would not re-open when I closed it. I tried numerous times to re-open it.
So now that I have that one folder sync’d, I tried the 2nd folder which is the largest. When I move it over, odrive seems to still do nothing.
Here is a current process view of it just sitting with a lot of memory taken up…
Once it goes into this state, odrive is inoperable and does not respond to any other folder. It’s basically in a crashed state as long as I have that large folder there.
odrive is not scanning or indexing anything as far as I can tell. I literally just hovered over the odrive icon as I type this and it disappeared. I can’t re-open it as that doesn’t work without a reboot.
The scope of data (overall) may be too much for odrive at this point. Do you know how many objects (total files and folders) are within odrive’s view now? I would like to get a good sense of how many objects odrive is actively tracking.
Are you able to unsync anything to reduce some of that scope?
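If counting through Explorer’s Properties dialogs is too slow, a quick sketch like this (the root path is a placeholder for your odrive folder) tallies everything in one pass:

```python
import os

def count_objects(root: str) -> tuple[int, int]:
    """Walk the tree once and return (folders, files).

    A single os.walk pass is much faster than opening the Properties
    dialog on each synced folder in Explorer.
    """
    folders = files = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)
        files += len(filenames)
    return folders, files
```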
The total amount of data is over 500GBs, and the folder count is maybe 50-100. Total file count is at maybe 50,000. Hard to determine at the moment, without viewing properties of many sync’d folders.
This is disappointing that I’ve hit odrive’s limit.
The ironic thing is that the folder I’m moving these two into is a dedicated folder for things I plan to later unsync.
As for unsyncing other things, that would defeat the purpose of the product for me. I don’t want to unsync to later have to re-download other stuff.
I feel like I’m the only one with these issues because I use odrive differently. I imagine most customers don’t flood odrive with a single large folder; most users probably create more data incrementally.
50K objects shouldn’t be a problem, actually.
You had mentioned ~1.7 million in a previous post, so I was thinking that was included in the scope. Is the 1.7-million-file folder the one you are trying to upload now?
So the 1.7 million file folder is the one I am trying to upload…
But I was able to upload the 111GBs folder (by restarting my system when odrive would hang / become unresponsive).
The current scope of odrive is 8,632 folders and 56,660 files at the moment. I came up with the “50,000” number out of nowhere, but I was surprisingly close.
I can send you a copy of this folder for you to test it out (I can find a non-odrive way of uploading). It’s just an Instagram scraping project.
On my own systems I have over 100k objects, so ~65000 objects shouldn’t be much of an issue. The 1.7 million object folder can pose a problem, though. Since odrive now has that folder in its scope (the one you are trying to upload), this counts against the total object tracking.
To upload that 1.7M folder, you are going to need to push it in incrementally. Do you think that will be possible?
Objects at scale is something we address in the new code we are trying to get out. Of course, saying that doesn’t do much good until we get it out, but I wanted to note that it is something we are working towards. There will always be detriments to large data sets though. There is no way to get around the additional overhead of adding more and more objects that need to be tracked and synced in real-time.
Objects at scale is something we address in the new code we are trying to get out.
I really hope this is true. It gives me hope. I realize that real-time sync is very hard for thousands of files because odrive has to keep eyes on thousands of objects all at once, which in reality is resource intensive. But I imagine that whether the overhead is CPU or memory, as long as those keep increasing, odrive should keep up. What I am saying is that I hope, as odrive continues to improve, CPU and RAM become the bottlenecks instead of odrive. Right now odrive seems to be the bottleneck.
Could you guys also consider adding some method of strict uploading? That way odrive would be able to better handle a massive folder/file scenario like this, just to get something pushed into the cloud…
I feel like odrive can handle this much data, but there is something that doesn’t allow it to pick up or work on the file set a little at a time. Perhaps there is a way to make odrive scan a folder and, if it exceeds 1,000 folders, handle the sync in a very different manner? Almost like analyzing how to process it so the job eventually gets done. I can push it incrementally, but I presume I’ll have to find some program that allows me to divide these folders and grab sections of them at a time.
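Something like this is what I have in mind — a rough sketch (the folder names are placeholders) that feeds top-level subfolders into the odrive folder a batch at a time, pausing between batches so the sync can catch up:

```python
import shutil
from pathlib import Path

def move_in_batches(src_root: str, dst_root: str, batch_size: int = 100):
    """Move top-level subfolders of src_root into dst_root in batches.

    Yields the size of each batch after moving it, so the caller can
    wait (e.g. until odrive finishes syncing) before continuing.
    """
    src, dst = Path(src_root), Path(dst_root)
    dst.mkdir(parents=True, exist_ok=True)
    batch = []
    for entry in sorted(src.iterdir()):
        batch.append(entry)
        if len(batch) == batch_size:
            for item in batch:
                shutil.move(str(item), str(dst / item.name))
            yield len(batch)
            batch = []
    if batch:  # move whatever is left over
        for item in batch:
            shutil.move(str(item), str(dst / item.name))
        yield len(batch)
```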
Thanks for the input @christianorpinell.
The new sync engine does things a bit differently and is built with scaling in mind. Not only will it perform better, but there will be lots of feedback so that you can better tell what is happening.