Analysis of MD trajectory using tSNE

Before starting anything with tSNE let’s read what is tSNE and how it has been compared with PCA. You can read it here. Several implementations of t-SNE are available here. A great introductory video on tSNE can be found here.

The dataset used in this explanation can be accessed here (named combine_times_ca.dcd and corresponding GRO file prot_ca.gro . Use VMD to open them and crosscheck).

# Download Matlab_r2017b

# Add the path of .dcd file reader for MatLab.Download the package from here


# Give the path of your dcd file. In my case I am using a dcd file named combine_times_ca.dcd which has atoms starting from 1 to 331.


This will produce a following output

h =

struct with fields:

fid: 3
 endoffile: 42789396
 NSET: 10560
 charmm: 1
 charmm_extrablock: 1
 charmm_4dims: 0
 N: 331

# Perform Pincipal Component Analysis

[pc, score, latent, tsquare] = pca(x(2:end,:));

# Plot first two principal components


# Label the plot


# It will pop up a window with PCA plot something the following


# Carrying on the calculation on the same Matlab window

rng default % for reproducibility

# Perform tSNE analysis with Barneshut algorithm

Y = tsne(x,'Algorithm','barneshut','NumPCAComponents',50);

#Produce the figure


# It will produce something like the following



The initial part of the tutorial was inspired by this one.

Collaboration on use of tSNE in molecular dynamics simulation is highly appreciated.


Jupyter notebook and Matplotlib

I was thinking to use Jupyter notebook for quite a some time and finally I started using it. Here is a quick installation guide for Linux (Ubuntu).

If Python is already installed

pip3 install --upgrade pip

pip3 install jupyter

Also you can install it with Anaconda. Details here 

To run notebook 

jupyter notebook

This will open up a webpage in your default browser and you can click on New and create a new notebook.

Gnuplot with Jupyter

I always wanted to have a gnuplot module integrated with Jupyter hence I installed the Gnuplot kernel with Jupyter notebook. The instructions are here


  1. System installation og Gnuplot in either Linux or Mac
  2. Installation of Jupyter notebook
  3. Metakernel

Install Metakernel by following command

pip install metakernel --upgrade 

Installation of Gnuplot kernel

pip install git+

Now when you will open jupyter notebook you will see Python 3 and Gnuplot option under New.

You can use bash within Jupyter notebook cell by starting with the following command


Matplotlib with Jupyter

Jupyter uses matplotlib as one of the default plotting programme. If you have a .dat file (space separated) you can use something like following and click Run to see the plot. Make sure you have numpy, pandas installed.

import pandas as pd
import numpy as np
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline'ggplot')

# read data

nDataPts = len(NPT.index);
selFrames = np.arange(0,nDataPts,10);

# compute running average 
def movingaverage(interval, window_size):
    window = np.ones(int(window_size))/float(window_size);
    return np.convolve(interval, window, 'same');
p_ma = movingaverage(NPT['PRESS'],1000);

plt.scatter(NPT["TIME(PS)"][selFrames], NPT["PRESS"][selFrames],alpha=0.3, edgecolors='none');
plt.xlabel('time (ps)'); plt.ylabel('pressure (bar)'); plt.title('pressure fluctuations during NPT simulation');


Table of contents in Jupyter

Often Jupyter notebook gets long enough by inclusion of plots and figures. In order to navigate easily within the notebook it needs a hyperlinked Table of Contents. All the jupyter extensions such as Table of Contents can be downloaded from here.


sudo pip install
jupyter contrib nbextension install --user

Now load the jupyter notebook. You can then open the nbextensions tab on the tree (dashboard/file browser) notebook page to configure nbextensions. You will have access there to a dashboard where extensions can be enabled/disabled via checkboxes. Additionally a short documentation for each extension is displayed, and configuration options are presented.

Note: To be continued


Useful scripts

Disclaimer: All these scripts were written by Emil Åberg
Imagine you have multiple folders with rec.pdbqt and rec.pdb files and have a folder with all the ligands with .pdbqt files. Imagine now you want to dock all the ligands with receptors present in all the folders. You can use a script something like the following
#3fli, 3fxv name of the folders which contain receptor pdb and pdbqt files
for i in 3fli 3fxv 3l1b 3okh 3olf 3omk 3omm 3oof 4qe6 4qe8; do

cd $i

cd ..

the VS.bash file looks something like following
#Ligands are situated at ../Ligands/Withoutcharges/ folder
for f in lig{1..36}_noch.pdbqt; do  b=`basename $f .pdbqt`; echo Processing ligand $b; mkdir -p data01; cp ../Ligands/Withoutcharges/$f . ; vina --config conf.txt --ligand $f --out data01/$f.pdbqt --log data01/$f.txt; done
#result analysis
cd data01
grep "   1 " *.txt | cut -c35-42,1-12 >>result
sort -rt" " +2 result >result2
cat result2
Now imagine in case of each receptor (for each folder) you want to list the binding affinity of all the 36 receptors in a text file you might like to use the following script
for a in 1osh_2 3fli 3fxv 3l1b 3okh 3olf 3omk 3omm 3oof 4qe6 4qe8 3dct 3dcu 3hc5 3hc6 3p88 3rut 3ruu_noanisou apo; do
cd $a/
echo -n >affinities_$a.txt
   cd data01/
    for i in {1..36}; do

               grep ' 1 ' lig${i}_noch.pdbqt.txt | awk '{print $2}' >>../affinities_$a.txt

   cd ../../

    echo "Rec $a" 



Modelling of ZIKA proteases

Disclaimer: This blog is made public in order to inspire a new generation of enthusiastic people to start working on this neglected disease as soon as possible.

I was doing some side-project especially with flaviviruses and the drug discovery targeting flavivirus proteases. That’s why ZIKA break into the scene of my research as ZIKA is likely to be the next big target in the field of drug discovery and development.

Due its similarity with dengue and hepatitis C the drug hunters have some obvious target in their mind which are targeting NS2B-NS3 and NS5 non-structural proteases. Keeping that in mind I designed two protein structures using homology modeling by Swiss-Model server.

  1. Model of NS5: The sequence of ZIKA NS5 (RNA dependent RNA polymerase) was retrieved from NCBI curated database for ZIKA (Accession Id: YP_009227205.1 and link: ) and submitted into the Swiss-Model server. The homology modeled structure for ZIKA NS5 was described in Figure 1. The homology model was generated using the crystal structure of Of the full-length Japanese Encephalitis Virus Ns5 (PDB: 4K6M)which has a 69.79% sequence identity and 98% query coverage in the blast run.zika_ns5

Figure 1. 3D homology modeled structure of ZIKA NS5 protease using the template of PDB: 4K6M. The nucleotide was introduced in the binding site to map the binding site residues.

The ligand interaction map (Figure 2) of ZIKA NS5 highlighted plausible binding site residues present in the active site of RNA dependent RNA polymerase.


Figure 2. Ligand interaction map ZIKA NS5 active site residues with nucleoside.

The quality assessments for the NS5 structure is presented below (Figure 3)




Figure 3. Several quality assessment parameter for the homology modeled ZIKA NS5 structure.

The assessment of the structure resulted led me to believe that the modeled structure is well predicted.

2. Model of NS2B-NS3 protease: The model of ZIKA NS2B-NS3 pro was designed using the template of ligand bound structure of Dengue NS2B-NS3 (PDB: 3U1I). A 3D ribbon representation of ZIKA NS2B-NS3 pro is presented in Figure 4.


Figure 4. The 3D ribbon representation of ZIKA NS2B-NS3 pro using the template structure of Dengue serine protease (PDB: 3 U1I).

Here is the sequence of ZIKA NS2B-NS3 pro:




I ran a 27ns NVT MD simulation to refine the structural model and then uploaded the structure in the MolProbity geometry analysis. The analysis report is mentioned with the attached PDF link prot27000_1-rama

These are just the very basic work but it gives an idea about the binding site which is clearly defined in case of both NS3 and NS5. I realized that there might be a bit more computational study necessary to refine the homology modeled structure but keeping in mind the defined active site, I believe it will be a great starting point for virtual screening and pharamacophore based drug design campaign.

Note: If you are using any of this material for your science or inspired by this short piece of work please cite it accordingly. Happy drug hunting!!

New review on “Flap Dynamics”

Our latest review deals with computational study on “Flap dynamics” across proteases has been accepted in Chemical Biology & Drug Design.

Drug re-purposing by targeting Dengue and CHIKV

One of our long standing dream to use drug re-purpose or repositioning strategy on neglected tropical diseases such as Dengue or CHIKV was finally successful with the publication of the latest article entitled “Reaching beyond HIV/HCV: nelfinavir as a potential starting point for broad-spectrum protease inhibitors against dengue and chikungunya virus” in RSC Advances (LINK). We are in process to further boost the selectivity index using computer aided drug design strategy.

MD study to understand the effect of T68A/N126Y mutations on CVB3 3C protease


One of the first paper which described the effect of T68A/N126Y double mutations got accepted recently in Molecular Biosystems. The paper entitled “Effect of T68A/N126Y mutations on conformational and ligand binding landscape of Coxsackievirus B3 3C protease” will hopefully provide some critical insights into molecular mechanism of mutations which affects drug design towards Coxsackievirus B3 3C protease.