====== dbCAN Installation ======
[[https://github.com/linnabrown/run_dbcan|run_dbcan]] is the standalone version of the dbCAN3 tool for automated CAZyme annotation. It uses HMMER, DIAMOND, and dbCAN_sub to annotate CAZyme families, and integrates CAZyme Gene Cluster (CGC) and substrate predictions.
> Note: With thanks to Becky Durr; these steps follow [[https://dbcan.readthedocs.io/en/latest/installation.html]].
===== Create and activate a conda environment for dbcan =====
First log in to Rocket. Start in your home directory.
  conda create -n dbcan python=3.8
  conda activate dbcan
===== Install dbcan package =====
  conda install -c bioconda dbcan
Verify the installation:
  run_dbcan -h
==== Install BLAST+ and verify installation ====
//(still inside the dbcan environment)//
  conda install -c bioconda blast
  blastp -version
===== Set up the database =====
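In your home directory, download and index the databases you wish to work with. The last three commands place the example //E. coli// input files in your home directory, next to ''db'':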
  cd ~
  test -d db || mkdir db
  cd db \
  && wget http://bcb.unl.edu/dbCAN2/download/Databases/fam-substrate-mapping-08012023.tsv && mv fam-substrate-mapping-08012023.tsv fam-substrate-mapping.tsv \
  && wget http://bcb.unl.edu/dbCAN2/download/Databases/PUL.faa && makeblastdb -in PUL.faa -dbtype prot \
  && wget http://bcb.unl.edu/dbCAN2/download/Databases/dbCAN-PUL_12-12-2023.xlsx && mv dbCAN-PUL_12-12-2023.xlsx dbCAN-PUL.xlsx \
  && wget http://bcb.unl.edu/dbCAN2/download/Databases/dbCAN-PUL.tar.gz && tar xvf dbCAN-PUL.tar.gz && rm dbCAN-PUL.tar.gz \
  && wget https://bcb.unl.edu/dbCAN2/download/Databases/dbCAN_sub.hmm && hmmpress dbCAN_sub.hmm \
  && wget https://bcb.unl.edu/dbCAN2/download/Databases/V12/CAZyDB.07262023.fa && mv CAZyDB.07262023.fa CAZyDB.fa && diamond makedb --in CAZyDB.fa -d CAZy \
  && wget https://bcb.unl.edu/dbCAN2/download/Databases/V12/dbCAN-HMMdb-V12.txt && mv dbCAN-HMMdb-V12.txt dbCAN.txt && hmmpress dbCAN.txt \
  && wget https://bcb.unl.edu/dbCAN2/download/Databases/V12/tcdb.fa && diamond makedb --in tcdb.fa -d tcdb \
  && wget http://bcb.unl.edu/dbCAN2/download/Databases/V12/tf-1.hmm && hmmpress tf-1.hmm \
  && wget http://bcb.unl.edu/dbCAN2/download/Databases/V12/tf-2.hmm && hmmpress tf-2.hmm \
  && wget https://bcb.unl.edu/dbCAN2/download/Databases/V12/stp.hmm && hmmpress stp.hmm \
  && cd ../ && wget http://bcb.unl.edu/dbCAN2/download/Samples/EscheriaColiK12MG1655.fna \
  && wget http://bcb.unl.edu/dbCAN2/download/Samples/EscheriaColiK12MG1655.faa \
  && wget http://bcb.unl.edu/dbCAN2/download/Samples/EscheriaColiK12MG1655.gff
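Because the download is one long ''&&'' chain, a single failed ''wget'' stops everything after it. A minimal sketch of a sanity check is given here: the function name ''dbcan_missing_files'' is hypothetical, and the file list is only a subset of what the commands above should produce (renamed downloads, one ''hmmpress'' index file per HMM, and the two DIAMOND databases).

```shell
# List expected dbCAN database files that are missing from a directory.
# Usage: dbcan_missing_files [db_dir]   (db_dir defaults to ~/db)
# The file list is an assumption derived from the setup commands on this page.
dbcan_missing_files() {
    db_dir="${1:-$HOME/db}"
    for f in \
        fam-substrate-mapping.tsv PUL.faa dbCAN-PUL.xlsx \
        dbCAN_sub.hmm.h3i dbCAN.txt.h3i tf-1.hmm.h3i tf-2.hmm.h3i stp.hmm.h3i \
        CAZy.dmnd tcdb.dmnd
    do
        # Print the name of every file that is not present
        [ -e "$db_dir/$f" ] || echo "$f"
    done
}
```

Calling ''dbcan_missing_files ~/db'' prints nothing when every listed file is present; any name it prints points to the step that needs re-running.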
NB: The example databases above are downloaded from the web with ''wget''. Data you need regularly should instead be stored on, and copied from, the Research Data Warehouse (mounted under ''/rdw''), which has a fast connection to Rocket, //saving you time and freeing Rocket resources for other users//.
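As a rough sketch of that workflow, a prepared database directory could be copied from RDW instead of being downloaded again. The function name ''copy_dbcan_db'' is hypothetical, and the source path is whatever location under ''/rdw'' your group actually uses; no real path is assumed here.

```shell
# Copy an already prepared dbCAN database directory into place.
# Usage: copy_dbcan_db <source_dir> [dest_dir]   (dest_dir defaults to ~/db)
# <source_dir> is assumed to be a directory under /rdw holding the indexed files.
copy_dbcan_db() {
    src="$1"
    dest="${2:-$HOME/db}"
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    # Copy the contents of src (not src itself) into dest
    mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}
```

Copying the indexed files this way also avoids re-running ''hmmpress'', ''makeblastdb'' and ''diamond makedb'', provided the copy was built with compatible versions of those tools.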
===== Run dbcan =====
  run_dbcan <input file> <input type: protein, prok or meta> [options]
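To make the argument order concrete, the sketch below prints (rather than runs) a ''run_dbcan'' command for a given input file. The helper names ''dbcan_input_type'' and ''dbcan_command'' are hypothetical. Mapping ''.fna'' to ''prok'' is an assumption that suits the example //E. coli// genome; metagenome assemblies would need ''meta'' instead. ''--db_dir'' and ''--out_dir'' are existing ''run_dbcan'' options, but check ''run_dbcan -h'' for your installed version.

```shell
# Guess the run_dbcan input type from the file extension.
# Assumption: .faa is protein FASTA, .fna is an isolate prokaryotic genome.
dbcan_input_type() {
    case "$1" in
        *.faa) echo protein ;;
        *.fna) echo prok ;;
        *) echo "unknown input type for: $1" >&2; return 1 ;;
    esac
}

# Print the run_dbcan command for an input file without executing it.
# Usage: dbcan_command <input_file> [db_dir]   (db_dir defaults to ~/db)
dbcan_command() {
    input="$1"
    db_dir="${2:-$HOME/db}"
    input_type=$(dbcan_input_type "$input") || return 1
    base=$(basename "$input")
    # Output directory is named after the input file, minus its extension
    echo "run_dbcan $input $input_type --db_dir $db_dir --out_dir output_${base%.*}"
}
```

For the sample genome downloaded earlier, ''dbcan_command EscheriaColiK12MG1655.fna'' prints a command of the form ''run_dbcan EscheriaColiK12MG1655.fna prok --db_dir ... --out_dir output_EscheriaColiK12MG1655'', which can be reviewed and then run inside the activated ''dbcan'' environment.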