====== Moving project data from Rocket to Comet ======

At the end of 2025, as Rocket is retired, projects may be transferred to Comet using the HPC Portal, but **this will not transfer any data** from Rocket to Comet. All data transfer is the responsibility of the project owner and members.

  * Rocket, Comet and Research Data Warehouse (RDW) are housed in the same data centre, with very fast network connections between their filesystems.
  * ''rsync'' is the best command for transferring large data (use ''rsync'' instead of ''scp'' and ''cp'').
  * Graphical file transfer tools can be used on a workstation or laptop, but these will be **//much slower//**.

===== Rsync for transfers between Rocket, Comet and RDW on Linux Command Line =====

''rsync'' is faster than ''cp'' and ''scp'' when updating existing data because its delta-transfer algorithm sends only the changed parts of files rather than entire files. It can also compress data in transit, handles directories efficiently, and can resume interrupted transfers. ''scp'' copies every file in full every time, which makes it simpler but less efficient for syncing existing data.

Also see our [[https://hpc.researchcomputing.ncl.ac.uk/dokuwiki/doku.php?id=faq:rdw_transfer_how-to&s[]=transfer|File Transfers and RDW]] page.

Basic syntax: ''rsync -a [--options] source destination''

==== Transfer files direct from Rocket to Comet ====

We think ALL projects should have an [[faq:rdw_setup|RDW share]]. Over the past months you have been advised to back up your Rocket data to RDW. If your data is on Research Data Warehouse (RDW), there is no rush; it can be downloaded efficiently to Comet as required because RDW is mounted (available) on Comet as ''/rdw''.

//However,//

  * NOW that Rocket is closing soon, any data not already backed up to RDW should be transferred direct to Comet, to save time.
  * If you have ephemeral data on Rocket that doesn't need to be retained after use, it should be transferred direct to Comet.
  * Note: on Rocket, project directory names are the **same as the project code**, like ''aprj''. On Comet, project directory names are prefixed with their origin, like ''**comet_**aprj'' or ''**rocket_**aprj''.

=== On a Comet login node ===

  * Log in to Rocket and check the path to the data you want to copy (go to the directory and type ''pwd'' to get the path), e.g. ''/nobackup/proj/projectcode/MyData/''
  * Log in to Comet
  * ''cd'' to your new project directory
  * Run the rsync command, like: ''**rsync -a //userid//@rocket.hpc.ncl.ac.uk:/nobackup/proj///projectcode/MyData// .///fromRocket//**'' (//italic text// must be replaced with your own user ID and directory names)

  $ pwd
  /nobackup/proj/rocket_code
  $ rsync -a --stats user@rocket.hpc.ncl.ac.uk:/nobackup/proj/myproject/MyData ./fromRocket

If your transfer takes a long time or you can't easily check the result using ''ls'':

  * These options will show what is happening and make the transfer faster: ''--itemize-changes --inplace --whole-file --size-only''. Note that ''--size-only'' treats files of the same size as unchanged, so use it only where that is a safe assumption.
  * ''--dry-run'' shows what //would// happen if you ran the command, but doesn't copy any files. Add this to the end of your command to check, then remove it and run the command 'for real'.
  * Re-run the rsync command after completion. If all went well, no files will be copied on the re-run.

  (OSS 2410.0) [user@cometlogin02(comet) comet_training]$ rsync -a --itemize-changes --inplace --whole-file --size-only --stats user@rocket.hpc.ncl.ac.uk:/nobackup/proj/myproject/MyData ./fromRocket

**Permissions related errors**

If you attempt to copy files to which you do not have read access, ''rsync'' will show an error like: ''rsync: send_files failed to open "/nobackup/proj/jshpcu/bonnie_rocket_64g.txt": Permission denied (13)''

This usually happens when someone other than you created the file.
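Files that will trigger this error can be found before the transfer starts. The following is a minimal sketch, to be run on Rocket; ''list_unreadable'' is a hypothetical helper name (not an existing command), and it assumes no file names contain newline characters.

```shell
# Sketch only: print every file or directory under the given path that the
# current user cannot read. list_unreadable is a hypothetical helper name.
# Assumes file names do not contain newline characters.
list_unreadable() {
    # Errors from directories that cannot be entered are discarded;
    # the directory itself is still printed by the readability test below.
    find "${1:-.}" -print 2>/dev/null | while IFS= read -r f; do
        [ -r "$f" ] || printf '%s\n' "$f"
    done
}
```

For example, ''list_unreadable /nobackup/proj/projectcode/MyData'' prints nothing if everything under that directory is readable by you. On very large directory trees this check may take some time.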
Ask the file owner to [[faq:009|fix permissions]]; if this isn't possible, [[:contact:index|contact the RSE-HPC team]] for help.

====== Transfer from your campus workstation to Comet ======

=== Command line Linux or Mac: ===

''scp'' is suitable only for small data transfers, but ''rsync'' is fine for all transfers to Comet.

== on campus ==

For example, on your campus workstation, you have a directory named ''forComet'' in your home directory:

  $ pwd
  /home/user
  $ rsync -az --stats ./forComet user@comet.ncl.ac.uk:

The '':'' after the hostname tells ''rsync'' that the destination is remote. With nothing after the colon, ''forComet'' is copied into your home directory on Comet; add a path after the colon to copy it somewhere else.

== off campus ==

Transferring data from off campus is to be avoided where possible, because speed will be poor. ''rsync'' is the best option when this must be done, because it allows proxy connections and resuming after failures. First check that you can ssh to the proxy server. Setting up ssh keys on the proxy (unix.ncl.ac.uk) can be helpful.

  $ pwd
  /home/user/myHPCdata
  $ rsync -az -e "ssh -J user@unix.ncl.ac.uk" ./forComet user@comet.ncl.ac.uk:

Here ''-J'' is the OpenSSH jump-host option (OpenSSH 7.3 or later), which routes the connection through the proxy.

=== Graphical options on Linux, Mac and Windows: ===

  * FileZilla https://wiki.filezilla-project.org/Using is a friendly graphical interface for Linux or Windows. It's available in software center on campus Windows machines.
  * WinSCP https://winscp.net/eng/docs/getting_started is available in software center on campus Windows machines.
  * FreeFileSync https://freefilesync.org/tutorials.php is an application for syncing files and directories that you can install with admin rights, or request as a one-off install by NUIT.

NB the free versions of these applications will only work //on campus// as they do not support using a gateway (proxy). Connect to wired LAN rather than campus WiFi to improve transfer speeds.

====== Transfer files from RDW to Comet ======

RDW will always be mounted on Comet. If your data is on RDW, there is no need to copy it to Comet until you need to work with it.

Your permissions to access RDW expire after an hour. Use ''kinit'' to refresh your login before running a long transfer.
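The login check can be scripted so that a transfer never starts without a valid ticket. This is a sketch under assumptions: it assumes RDW access on Comet depends on a Kerberos ticket that ''klist'' can see, and that ''klist -s'' behaves as in MIT Kerberos (silent, non-zero exit status when there is no valid ticket). ''have_ticket'' and ''with_ticket'' are hypothetical helper names.

```shell
# Sketch only: refuse to start a command unless a valid Kerberos ticket exists.
# Assumes MIT Kerberos, where `klist -s` prints nothing and exits non-zero
# if the credentials cache is missing or expired.
# have_ticket and with_ticket are hypothetical helper names.
have_ticket() {
    klist -s 2>/dev/null
}

with_ticket() {
    if have_ticket; then
        # Ticket looks valid: run the command given as arguments.
        "$@"
    else
        echo "No valid Kerberos ticket found: run kinit first" >&2
        return 1
    fi
}
```

For example, ''with_ticket rsync --dry-run -rltv /rdw/path/to/my/share/source/ /nobackup/myproj/destination'' only runs ''rsync'' if the check passes. The check happens once, at the start, so a ticket can still expire part-way through a very long transfer.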
Try out a dry run:

  [user@cometlogin01(comet) ~]$ rsync --dry-run -rltv --inplace --itemize-changes --progress --stats --whole-file --size-only /rdw/path/to/my/share/source/ /nobackup/myproj/destination

Run 'for real':

  [user@cometlogin01(comet) ~]$ kinit
  [user@cometlogin01(comet) ~]$ rsync -rltv --inplace --itemize-changes --progress --stats --whole-file --size-only /rdw/path/to/my/share/source/ /nobackup/myproj/destination

-------------------------

[[started:data_transfer|Back to Data Transfer]]