ModelAngelo is an automatic atomic model building program for cryo-EM maps.

[[https://github.com/3dem/model-angelo]]

===== with IGBMC HPC =====

  * Upload the protein (and DNA/RNA) sequences as separate fasta files, as well as the map.
  * Check the hand of the map and flip it in ChimeraX (<quote>''volume flip #1''</quote>) if necessary.
  * Log in to the HPC: <quote>''ssh <login>@hpc.igbmc.fr''</quote>
  * Edit the Slurm submission script (with <quote>''nano''</quote>, for instance):

<code>
#!/bin/bash

################################ Slurm options #################################
### Job name
#SBATCH --job-name=model_angelo
### Run time limit (format D-HH:MM:SS, e.g. "1-00:00:00" for one day)
#SBATCH --time=10:00:00
### Requirements
#SBATCH --partition=lamour-ruff #(or gpu)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=40GB
#SBATCH --gres=gpu:1
### Email
#SBATCH --mail-user=username@igbmc.fr
#SBATCH --mail-type=ALL
### Output
#SBATCH --output=/shared/mendel/projects/xxx/cryoem/model_angelo/model-angelo-%j.out
################################################################################

echo '########################################'
echo 'Date:' $(date --iso-8601=seconds)
echo 'User:' $USER
echo 'Host:' $HOSTNAME
echo 'Job Name:' $SLURM_JOB_NAME
echo 'Job Id:' $SLURM_JOB_ID
echo 'Directory:' $(pwd)
echo '########################################'
# Load modules
module load model-angelo/1.0.1
# Location of the ModelAngelo weights
export TORCH_HOME=/shared/genomes/model_angelo_weights/
# Job command
model_angelo build -v map.mrc -pf prot.fasta -df dna.fasta -rf rna.fasta -o output
echo 'Done.'
echo '########################################'
echo 'Job finished' $(date --iso-8601=seconds)
</code>

Run the script: 
<code>sbatch model_angelo_script.sh</code>

You will get one email when the job starts and a second one when it completes (or crashes). You can also follow it with <quote>''squeue''</quote>.
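Before submitting, it can help to confirm that the map and sequence files are actually present on the cluster. The sketch below is only an illustration: ''check_inputs'' is a hypothetical helper (not part of ModelAngelo or Slurm), and the file names are the ones used in the script above.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 only if every file given as an argument
# exists and is non-empty; reports each problem on stderr.
check_inputs() {
    local status=0
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "Missing or empty input: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example (assumed file names): submit only when the inputs look sane.
# check_inputs map.mrc prot.fasta && sbatch model_angelo_script.sh
```

Once the job is queued, ''squeue -u $USER'' restricts the listing to your own jobs.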

===== on POLLUX computer =====

  * Upload the protein (and DNA/RNA) sequences as separate fasta files, as well as the map.
  * Check the hand of the map and flip it in ChimeraX (<quote>''volume flip #1''</quote>) if necessary.

=== Setup ===

<code>
ssh jmwadmin@pollux

bash

conda activate model_angelo
</code>


=== Building a map with FASTA sequence ===

This is the recommended use case, when you have a medium-to-high resolution cryo-EM map (better than 4 Å, i.e. a numerically smaller value) as well as a FASTA file with all of your protein sequences.

Let's say the map's name is <quote>''map.mrc''</quote> and the sequence file is <quote>''sequence.fasta''</quote>. To build your model in a directory named <quote>''output''</quote>, you run:

<code>model_angelo build -v map.mrc -pf sequence.fasta -o output</code>

If you would like to build nucleotides as well, you need to provide the RNA and DNA portions of your sequences in separate files, like so:

<code>model_angelo build -v map.mrc -pf prot.fasta -df dna.fasta -rf rna.fasta -o output</code>

If you only have RNA or DNA, you can drop the other input.
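As a rough sketch of how the optional nucleotide inputs could be handled in a script, the hypothetical helper ''build_cmd'' below prints the build command and adds ''-df'' / ''-rf'' only for files that exist and are non-empty. It uses only the flags shown above; the helper name and argument order are assumptions made for illustration, and it assumes paths without spaces.

```shell
#!/bin/bash
# Hypothetical helper: print a model_angelo build command.
# Arguments: MAP PROT_FASTA DNA_FASTA RNA_FASTA OUTDIR
# The protein FASTA is always passed; DNA/RNA are added only if present.
build_cmd() {
    local map=$1 prot=$2 dna=$3 rna=$4 out=$5
    local cmd="model_angelo build -v $map -pf $prot"
    if [ -s "$dna" ]; then
        cmd="$cmd -df $dna"
    fi
    if [ -s "$rna" ]; then
        cmd="$cmd -rf $rna"
    fi
    echo "$cmd -o $out"
}

# Example: inspect the command before running it.
# build_cmd map.mrc prot.fasta dna.fasta rna.fasta output
```

Printing the command rather than running it directly makes it easy to check what will be executed before pasting it into the terminal or the submission script.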

If the program output stops before <quote>''GNN model refinement, round 3 / 3''</quote> completes, an error occurred; the details are in <quote>''output/model_angelo.log''</quote>. Otherwise, you can find your model in <quote>''output/output.cif''</quote>. The name of the mmCIF file is based on the output folder name, so if you specify, for example, <quote>''-o testing/test/model_building''</quote>, the model will be in <quote>''testing/test/model_building/model_building.cif''</quote>.
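The naming rule above can be expressed as a small shell sketch. The helpers ''model_path'' and ''report_model'' are hypothetical names introduced here for illustration; they only rely on the file layout described in this section.

```shell
#!/bin/bash
# Hypothetical helper: print the expected mmCIF path for an output directory.
# The file is named after the last component of the directory.
model_path() {
    local out=${1%/}    # drop one trailing slash, if any
    echo "$out/$(basename "$out").cif"
}

# Hypothetical helper: say where the model is, or point to the log file.
report_model() {
    local out=${1%/}
    local cif
    cif=$(model_path "$out")
    if [ -s "$cif" ]; then
        echo "Model written to $cif"
    else
        echo "No model found; check $out/model_angelo.log" >&2
        return 1
    fi
}

# Example: report_model testing/test/model_building
```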


=== Building a map with no FASTA sequence ===

If you have a sample where you do not know all of the protein sequences that occur in the map, you can run <quote>''model_angelo build_no_seq''</quote> instead. This version of the program uses a network that was not trained with input sequences, and it does not post-process the built model.

Instead, in addition to a built model, it provides you with HMM profile files that you can use to search a database such as UniRef with HHblits.

You run this command:

<code>model_angelo build_no_seq -v map.mrc -o output</code>

The model will be in <quote>''output/output.cif''</quote> as before. Now there are also HMM profiles for each chain in HHsearch's format here: <quote>''output/hmm_profiles''</quote>. To do a sequence search for chain A (for example), you should first install HHblits and download one of the databases. Then, you can run

<code>hhblits -i output/hmm_profiles/A.hhm -d PATH_TO_DB -o A.hhr -oa3m A.a3m -M first</code>

You will have your result as a multiple sequence alignment here: <quote>''A.a3m''</quote>.
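To repeat the search for every chain, one option is to generate the same command for each profile found in the output folder. The sketch below is an assumption-laden illustration: ''hhblits_cmds'' is a hypothetical helper, the profiles are assumed to be named ''<chain>.hhm'' as in the example above, and paths are assumed to contain no spaces. It only prints the commands, so they can be reviewed before being run.

```shell
#!/bin/bash
# Hypothetical helper: print one hhblits command per chain profile.
# Arguments: OUTDIR (ModelAngelo output folder) and DB (database path).
hhblits_cmds() {
    local out=${1%/} db=$2
    local hmm chain
    for hmm in "$out"/hmm_profiles/*.hhm; do
        # Skip the unexpanded pattern when no profile exists
        [ -e "$hmm" ] || continue
        chain=$(basename "$hmm" .hhm)
        echo "hhblits -i $hmm -d $db -o $chain.hhr -oa3m $chain.a3m -M first"
    done
}

# Example: list the commands, then run them once they look right.
# hhblits_cmds output PATH_TO_DB
# hhblits_cmds output PATH_TO_DB | sh
```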