Here is some trips and tricks based on our calculations with Linux, Gromacs and Plumed. You may find it useful in your research (but it is just made for my personal convenience)
1. How to generate stripped PDB with deleted ANISOU
grep -v ANISOU my-pdb-with-ANISOU.pdb > my-pdb-without-ANISOU.pdb
2. Position restraint procedure for C-alphas
gmx make_ndx -f confout.part0001.gro -o index.ndx
gmx genrestr -f confout.part0001.gro -n index.ndx -fc 500 500 500 -o posreca.itp
3. Trajectory handling
Processing trajectory
#If you have jump problem during visualisation
gmx trjcat -f traj_comp.part*.xtc -o trajall.xtc
echo 1 | gmx trjconv -s ~/pfs/HIV/ref.tpr -f trajall.xtc -pbc nojump -o trajnoj.xtc
echo 3 1 | gmx trjconv -s ~/pfs/HIV/ref.tpr -f trajnoj.xtc -fit rot+trans -o trajfit.xtc
gmx trjcat -f trajfit.xtc -dt 10 -o tfit.xtc
#If there is problem with periodic boundary conditions in Gromacs smulation
trjcat -f traj*.part*.xtc -o trajall.xtc
echo 1 | trjconv -s ../gpwtest.tpr -f trajall.xtc -pbc mol -o trajmol.xtc
echo 3 1 | trjconv -s ../gpwtest.tpr -f trajmol.xtc -fit rot+trans -o trajfit.xtc
trjcat -f trajfit.xtc -dt 10 -o tfit.xtc
Extract a single frame for restart
gmx trjconv -s plm.tpr -f traj.part0001.trr -dump 1800 -o conf1800.gro
Concatenate trajectories
gmx trjcat -f traj.part*.xtc -o trajall.xtc
Combining trajectories from different simulations
trjcat -f R1/traj_r1_times.xtc R2/traj_r2_times.xtc -cat -nosort -o combine.xtc
the flags:
-cat : Do not discard double time frames
-nosort: Sort trajectory files (not frames)
To make the output trajectory (combine.xtc) more organized we can use the -timestep flag
trjcat -f R1/traj_r1_times.xtc R2/traj_r2_times.xtc -cat -nosort -o combine.xtc -timestep 1
-timestep : Change time step between input frames (ps)
Fix PBC problem (try one of them):
gmx trjconv -s ref.tpr -f trajall.xtc -pbc mol -o trajmol.xtc
gmx trjconv -s ref.tpr -f trajall.xtc -pbc nojump -o trajnoj.xtc
Align trajectory to the reference frame
trjconv -s ref.tpr -f trajmol.xtc -fit rot+trans -o trajfit.xtc
Calculating distance
Use "gmx distance -n index.ndx -f traj.xtc -s topol.tpr -oav -oall" with index file write in this way:
[CA_1 – CA_2]
300 402
Sparse out trajectory (write every 10 ps) or cut out certain interval
trjcat -f trajfit.xtc -dt 10 -o tfit.xtc
trjcat -f trajfit.xtc -b 100 -e 140 -o trajfit_interestingpart.xtc
Fix the volume for NVT simulation (after NPT equilibration)
gmx eneconv -f ener.part*.edr -o enerall.edr
gmx energy -f enerall.edr -o vol.xvg
xmgrace vol.xvg
head vol.xvg
grep '^....0' vol.xvg
grep '^...00' vol.xvg
(just a trick to see only points represented in the trr trajectory, every 10 or 100 ps)
Select a frame (near the end) that has a volume close to the average of the last 10 nanoseconds or so. Dump it as above.
Covariance analysis, projections on principal components (modes), and visualization of modes
gmx covar -s plm.tpr -f trajfit.xtc
gmx anaeig -s plm.tpr -v eigenvec.trr -f trajfit.xtc -proj proj.xvg -last 10
gmx anaeig -s plm.tpr -v eigenvec.trr -f trajfit.xtc -first 4 -last 4 -extr ext4.pdb -nframes 100
(gives an artificial trajectory between the extreme points of mode 4, use Rock in VMD to see it rocking)
#convert trajectory to xyz format, examples:
writexyz_delfirst ../protstart.pdb tfit.xtc
writexyz gpwtest.tpr ../tfit.xtc
writexyz -parm7 ../com_solvated.top -crdbox ../test.mdcrd
In case of .nc or netcdf file (.nc or netcdf format is binary whereas .mdcrd is ascii). In that case
writexyz -parm7 ../com_solvated.top -nc ../md_dry.nc
(uses VMD, so takes the same arguments as vmd, outfile is called out.xyz)
Convert AMBER trajectory to GROMACS or vice-versa or other general trajectory conversion
install mdtraj
$ sudo pip install mdtraj
if pip, numpy and cython is not pre-installed you need to install these packages.
sudo apt-get install python-pip
sudo pip install numpy cython
then execute
sudo pip install mdtraj
Upon installation you can exceute the conversion using the following command or similar arguments.
/usr/local/bin/mdconvert ~/Erik/plasmepsin_correct/R2/md_total.nc -o traj_r2.xtc
Extend Simulation with Plumed
#gmx convert-tpr extends the simulations by the amount of time in ps which is mentioned with the -extend flag. -s = input .tpr file -o = output .tpr file
gmx convert-tpr -s gpwtest.tpr -o gpwtest.tpr -extend 30000
Clustering analysis with G_cluster
g_cluster is capable clustering using RMSD.
/home/s/sbhakat/pfs/bin/gromacsbin462/bin/g_cluster_plumed_mpi_d -f ../../Test_combine/combine_all_times.xtc -s ../../prot0.gro -n ../../../1D_opt/index.ndx -xvg none -wcl 5 -cutoff 0.04 -dt 100 -method gromos -minstruct 150 -clid clust-id.xvg
4. Calculate free-energy surface after a simulation
One-dimensional:
plumed sum_hills --hills 'HILLS' --outfile fes --stride 100 --bin 100
bin = number of bins
stride = how often it outputs intermediate files (time between files = stride*pace*timestep)
Two-dimensional:
plumed sum_hills --hills HILLS --outfile fes --stride 500 --bin 100,100
Two-dimensional FES projected on one CV:
plumed sum_hills --hills HILLS --outfile fes --stride 500 --bin 100,100 --idw d1 --kt 2.494
Note : The result depends on the temperature (2.494 = kT with T=300K)
5. Plotting examples
(use “head COLVAR” to see what the columns mean)
gpu 1:2 COLVAR
gpu '1:($2*10)' COLVAR (note the quotes to protect $ from bash)
gpuwp 1:3 COLVAR
gpsev 3 COLVAR (assumes one x column, plots the next 3 columns)
gp fes*.dat (plots several files in the same graph)
Several separate plots in the same figure:
gnuplot> set multiplot layout 4,2
multiplot> sp 'fes2.dat' u 1:2:3
multiplot> sp 'fes3.dat' u 1:2:3
multiplot> sp 'fes4.dat' u 1:2:3
multiplot> sp 'fes5.dat' u 1:2:3
multiplot> sp 'fes6.dat' u 1:2:3
multiplot> sp 'fes7.dat' u 1:2:3
multiplot> sp 'fes8.dat' u 1:2:3
multiplot> sp 'fes9.dat' u 1:2:3
multiplot> unset multiplot
Plot two-dimensional FES:
gnuplot> set pm3d map
gnuplot> sp ‘fes1.dat’ u 1:2:3
Histogram
#Usage: histocol.py file column [nbin [ start stop] ]
histocol.py COLVAR 2 100 > hist1.dat
Running averages
#Usage: runav.py file windowlength
runav.py COLVAR 100 > COLVAR_av100
(averages all columns except the x column)
Average and standard deviation
nohead 1 COLVAR | head -500 | awk '{print $2}' |stdev
Test switching function (and plot histogram in the same plot)
p 'hist6.dat' u ($1*0.1):(0.003*$2) w l, (1-((x-d0)/r0)**n)/(1-((x-d0)/r0)**m), d0=0.29,n=6,m=12,r0=0.05
Install Matlab from an ISO image
mkdir Mathworks
sudo mount -o loop -t iso9660 ~/matlab/R2015a-glnxa64.iso Mathworks/ (give system password if asked)
cd Mathworks
./install
umount Mathworks
Unmount a directory
You can use the following command
fusermount -u /some/directory/
Here are the options of fusermount
fusermount: [options] mountpoint Options: -h print help -V print version -o opt[,opt...] mount options -u unmount -q quiet -z lazy unmount
A more detailed source to read regarding umount is here
Downloading big files from Cluster scp stalls problem
rsync -avz --progress sbhakat@aurora.lunarc.lu.se:/lunarc/nobackup/users/sbhakat/BACE/Metadynamics/3TPL/COM_TIP3P_DIST/trajfit.xtc Aurora_download/BACE/Metadynamics/3TPL/
File conversion:
moving the files starting with # to a new file mv "#traj.part0001.xtc.1#" traj.part0000.xtc Automatic job submission script for Gromacs #!/bin/sh # # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # # Job name #SBATCH -J HIV_gromacs # # No. of nodes and no. of processors per node #SBATCH -N 1 #SBATCH --tasks-per-node=16 #SBATCH --exclusive # # Time needed to complete the job #SBATCH -t 48:00:00 # module add gcc/4.8.1 module add openmpi/1.8.1/gcc/4.8.1 module add cuda/6.5 if [ -e state.cpt ]; then extra='-cpi state.cpt' fi # GMX-5.0.4 # # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # mpirun -npernode 16 /home/per/lun/gromacs5bin/bin/mdrun_mpi -s hiv300ns.tpr $extra -ntomp 1 -gpu_id 0000000011111111 -v -maxh 48 rc=$? echo rc $rc if [ $rc != 0 ]; then echo >stop; fi if [ -f stop ]; then exit fi if [ -f confout.* ]; then exit fi sleep 10m sbatch job.sh
To compare tICA output from MSMBuilder (RawpositionFeaturiser) with Plumed Driver use the TYPE=SIMPLE as TYPE=OPTIMAL although is perfectly linear it is doing some kind of normalization inside Plumed.
PCAVARS REFERENCE=reference_corrected.pdb TYPE=SIMPLE LABEL=tica PCAVARS REFERENCE=reference_corrected.pdb TYPE=OPTIMAL LABEL=ticaopt PRINT ARG=tica.* FILE=COLVAR PRINT ARG=ticaopt.* FILE=COLVAROPT
Task: Run a job over VPN even if VPN disconnects
nohup command &
Task: Numbering the DESRES trajectory
awk -v CONVFMT=%.5f '{a=a+0.00025; $1=a; print $0}' COLVAR_hbondnohead >COLVAR
Task: Find the max of a particular column
sort -nrk1,1 filename | head -1 | cut -d ' ' -f3 #this is for getting max for column 1