Hi @voarsh,
Disk usage will definitely increase as you go because of the local tracking databases and logging. Since you have a lot of data I would expect the tracking database to grow fairly large.
The I/O activity could also be due to logging and sync database activity (reads/writes during the bulk download process).
The bulk processing you are doing is definitely going to need resources since it is pretty intensive, especially with the higher-performance script running to do multi-threaded folder expands and downloads. If possible, I would make sure you have a decent amount of disk and CPU available to go through the 8TB of data.
What are you running this on and how constrained are the resources?
Just an update on my sync.
It's going slowly indeed: the high I/O usage is grinding my machine to a halt, with restarts and weird behaviour. I've had to drop down to 1-2 processes.
I’m now experiencing an unusual thing where I needed to reinstall the odrive agent, and when I run a sync in the same directory I am seeing:
Unable to sync __init__.pyc.cloud. This file is not inside an odrive folder.
but for everything…
I re-mounted ("$HOME/.odrive-agent/bin/odrive" mount "/shared/odrive" /), just in case, and ran sync ("$HOME/.odrive-agent/bin/odrive" sync "/shared/odrive/Google Drive" --recursive):
and get:
Unable to sync Finale Fireworks.mp3.cloud. This file doesn’t exist.
for loads and loads of files. Should I be concerned?
Hi @voarsh,
If you reinstalled odrive you will need to let odrive scan through the entire mount to re-index everything. This could take quite a while, unfortunately. Until odrive is able to index the structure it won’t “know” about it, so you may see errors like this.
If you are concerned about any files in particular, you can verify them via the odrive web client or the Google web client.
So, I suppose that it is finding files that are in the folder, but not on the remote?
I don't know why I needed to reinstall it; it just wouldn't let me run odriveagent, and it didn't think the agent was running.
Thanks, I guess I'll just continue running it overnight. I think it's only done about 2 TB so far, and I've actually deleted a portion of the downloaded data that was unnecessary. I have a feeling it'll take another 1-2 months; even though my internet speed is 3-5 MB/s, the client/system instability is delaying me.
The errors you posted previously indicated a lack of drive space, which could also cause local tracking database corruption if odrive is unable to properly write to the database files. It is possible it hit something like that, which caused an issue with it running. It is hard to tell without seeing any logs or output.
Can you tell me a bit about what you are running on and what the resources dedicated to it look like?
A large-scale bulk operation like this will need a good amount of resources to run comfortably, which means you’ll want to be generous with disk space, CPU, and memory allocation, however that is being done.
Can you be more targeted in your download commands to avoid unnecessary sections?
The LXC container has 6 GB of disk space free (I haven't seen it max out), and the external drive has 11 TB free. There's no space issue.
Debian-flavoured host with a Turnkey Core container, 7 GB of RAM dedicated to this instance, and the external drive is on USB 2, unfortunately.
It has been obvious to me that CPU has never been an issue:
(screenshot: container CPU usage)
The container is hosted by Proxmox (Debian), and the host is passing the mounted drive through to the LXC container. Raising the odrive process count to 4 thrashes the I/O. Not sure if it's a USB 2 issue, Proxmox, or something else, but I had to lower it.
I had initially tried it on USB 3, but the CPU on that host was slightly weaker and took a beating, so I opted for the one I'm using now, which only has USB 2. So I don't think that's the cause.
I did have that thought. Eventually all will need to be downloaded. I suppose what I should have done is have a clean up of non-essential stuff before syncing…
Hey @voarsh,
I had a thought about the high I/O. The CLI is packaged in a way where it needs to “unpack” when it is executed. When you are using it in a scripted fashion it could raise I/O when it is called over and over again. This hasn’t been an issue on the systems I have tested on, but it could be impacting your system more than usual.
To avoid this you can use the pure python CLI version. You would just need to have python installed on the system to do this. This would avoid the binary executable unpacking and may alleviate some of the I/O pressure you are seeing. To be clear, this is only for the CLI client, the odriveagent is continuously running, so is not affected in this way.
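For example, you would call the script through your system Python instead of executing the packaged binary (a rough sketch; it assumes the Python CLI is saved alongside the agent as odrive.py):

# check agent status via the pure Python CLI instead of the packaged binary
python "$HOME/.odrive-agent/bin/odrive.py" status

# sync a single placeholder file the same way (the path here is just an example)
python "$HOME/.odrive-agent/bin/odrive.py" sync "/shared/odrive/Google Drive/example.cloud"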
I had another system crash, which wasn't entirely related to odrive, and I had to forcibly turn off the computer. We're talking about a load of over 200; the system was completely out of action…
Running odrive gets me:
root@nextcloud ~# "$HOME/.odrive-agent/bin/odrive" > /dev/null 2>&1
root@nextcloud ~# "$HOME/.odrive-agent/bin/odrive" status
There was an error sending the command, please make sure odrive agent or desktop is running.
root@nextcloud ~# python "$HOME/.odrive-agent/bin/odrive" status
File "/root/.odrive-agent/bin/odrive", line 1
SyntaxError: Non-ASCII character '\x97' in file /root/.odrive-agent/bin/odrive on line 2, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
Really annoying, because the only way I know to sort this is to re-download/install odrive, resetting the DB… which will take over 6 hours to re-scan.
exec 6>&1;num_procs=3;output="go"; while [ "$output" ]; do output=$(find "/shared/odrive/Google Drive" -name "*.cloud*" -print0 | xargs -0 -n 1 -P $num_procs python "$HOME/.odrive-agent/bin/odrive" sync | tee /dev/fd/6); done
So, like this? It doesn't seem to like having "python" included.
Or:
python "$HOME/.odrive-agent/bin/odrive.py" mount "/shared/odrive" /
Yeah you will want to target the odrive.py file instead of the odrive binary. So: exec 6>&1;num_procs=4;output="go"; while [ "$output" ]; do output=$(find "/shared/odrive/Google Drive" -name "*.cloud*" -print0 | xargs -0 -n 1 -P $num_procs python "$HOME/.odrive-agent/bin/odrive.py" sync | tee /dev/fd/6); done
Make sure you are running odriveagent instead of odrive to run the agent process. If that is still not working, then try running it without piping the output to null. So just: $HOME/.odrive-agent/bin/odriveagent
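For example (just a sketch; the console log filename here is my own placeholder, not something odrive creates):

# run the agent in the foreground first so any startup errors are visible
"$HOME/.odrive-agent/bin/odriveagent"

# once it starts cleanly, run it in the background and keep the console output
nohup "$HOME/.odrive-agent/bin/odriveagent" > "$HOME/.odrive-agent/agent-console.log" 2>&1 &

# then confirm it is responding
python "$HOME/.odrive-agent/bin/odrive.py" status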
Hey @voarsh,
It looks like the errors you were seeing may be related to Google-side issues; is that correct? The log files can also shed more light on the errors that come up.
For the Google Docs files, I would probably need to see the logs, or even the diagnostics, if Google is returning an unexpected error.
The next time you see that, can you send a diagnostic (diagnostics command)?
Hi @voarsh,
You don’t have to stop sync. You can just run the diagnostics command from the CLI. You would just need either another ssh session to the server, or use screen to manage multiple terminal sessions within a single window (https://www.howtogeek.com/662422/how-to-use-linuxs-screen-command/).
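As a rough sketch of that workflow (the session name is arbitrary):

# start a named screen session and run the long sync loop inside it
screen -S odrive-sync
# ...start the sync loop, then detach with Ctrl-A followed by D

# from the normal shell (or a second screen window), send a diagnostic
python "$HOME/.odrive-agent/bin/odrive.py" diagnostics

# reattach to the sync session later
screen -r odrive-sync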
Hi Tony.
I've downloaded roughly 6 TB. Not sure exactly how much is done, but for the last day or two it seems to be cycling through the files it's not able to download, like Google Docs/Sheets, "malware"-flagged files, and folder paths that are illegal.
It seems to be cycling through these download failures, forever, and not finishing downloading the rest of the files. Or, it has finished and it just cycles through the failed files.
Yes, this is correct. The download script is dumb, so it will just continue cycling through files it is unable to download, forever.
For the flagged files, did you try the build that allows you to override the flag error?
For the google documents, since they can’t really be “downloaded” anyway, you can just filter them out in the script command like this: exec 6>&1;num_procs=4;output="go"; while [ "$output" ]; do output=$(find "/shared/odrive/Google Drive" -name "*.cloud*" ! -path "*.gdocx*" ! -path "*.gsheetx*" -print0 | xargs -0 -n 1 -P $num_procs python "$HOME/.odrive-agent/bin/odrive.py" sync | tee /dev/fd/6); done
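For readability, the same loop broken out over several lines with comments (functionally identical to the one-liner above):

exec 6>&1          # duplicate stdout so tee can echo progress from inside the $( ) capture
num_procs=4        # number of parallel sync workers; lower this if I/O starts to thrash
output="go"
while [ "$output" ]; do
  # list remaining placeholder files, skipping Google Docs/Sheets placeholders,
  # and feed them to parallel "odrive.py sync" calls
  output=$(find "/shared/odrive/Google Drive" -name "*.cloud*" \
             ! -path "*.gdocx*" ! -path "*.gsheetx*" -print0 \
           | xargs -0 -n 1 -P "$num_procs" python "$HOME/.odrive-agent/bin/odrive.py" sync \
           | tee /dev/fd/6)
done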
Unable to sync S…Wish.flv.cloud. This file doesn’t exist.
Not sure what this error is - it won’t let me sync a folder with files that clearly do exist?
–
I had the idea to try it on Windows, and that seems to be giving more success.
Due to not being able to download these folders with videos inside (for unknown reasons), there are still hundreds of GB left to download.
Regarding Sheets/Docs, do you suggest downloading from the web interface, by searching for file type and mass downloading?
Hi @voarsh,
Once you get the "This file doesn't exist" message again, can you send a diagnostic from the command line (the diagnostics command), so I can take a closer look?
For the Google Doc files, yes, you would need to download them from the Google Drive web client so that it converts them to Excel, Word, etc.
Hi @voarsh,
I hope you are having a good holiday, as well!
Were you able to send a diagnostic after hitting these errors? I should be able to take a look at that and get a better idea of what is causing the errors. If you are able to, please reproduce the errors you are seeing (“This file doesn’t exist” and “Error updating the folder. Error accessing file”) and then send a diagnostic over.
What does ~/.odrive-agent/log/main.log show when you hit these errors?
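If it helps, something like this will show the relevant part of the log right after an error appears (the log path is the one mentioned above):

# show the most recent agent log entries
tail -n 100 "$HOME/.odrive-agent/log/main.log"

# or watch it live while the sync loop runs
tail -f "$HOME/.odrive-agent/log/main.log"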