Molecular Docking: Tutorial

Docking with Autodock Vina: A step by step guide for Beginners or Advanced Users

Prepare the ligand:

Using MarvinSketch and OpenBabel:

MarvinSketch is an amazing JAVA based functionality which includes several basic and advanced functionalities and completely free for academic use. You can download the package from


1.1. Open the Marvin Sketch (MS)

Draw the ligand in MS or import the SMILE or InchI Keys or open an supplied ligand. MS supports SDF, MOL, MOL2, PDB and all other common file formats.

1.2. Adjust the protonation state of the ligand:

Click on MS and then Calculations > Protonation> Major Microscopes> a window will appear where you can input your desired pH and click OK and it should show a new window with the correct protonation state of the ligand at your desired pH. You can SAVE the ligand with correct protonation state in your desired file format.


If a window appeared saying that “The product will run in Evaluation Mode” then click OK. Open the saved ligand in Notepad/Wordpad or use vi (using Mac or Linux) to see if there is hydrogen in the saved file or not.

1.3. Now it is time to minimize the ligand:

Install a well used and handy tool named OpenBabel and use it for the minimization procedure.

Optimize the molecule using MMFF94:

babel -isdf lig.sdf -h -p 6.5 --gen3D --conformer --ff MMFF94s -osdf lighopt.sdf
1.4. Convert to PDB or Mol2 format for visual inspection using OpenBabel:
babel -i lighopt.sdf -opdb lighopt.pdb 
Imagine if you download a big Dataset and want to use all the ligands during the docking process. In that case follow as mentioned below:

Step 1: Split the molecules

babel -isdf dataset.sdf -osdf lig.sdf -m #This will split datset.sdf -m option does the splitting

Step 2: Convert to .mol2 format

#This will convert 11320 ligands into separate .mol2 files
for i in {1..11320}; do
babel -isdf lig$i.sdf -omol2 lig$i.mol2

Step 3: Generate approximated 3D conformation for all the individual .mol2 files and add Hydrogen based on a chosen pH

for i in {1..11320}; do
babel -imol2 ../../lig$i.mol2 -h -p 6.5 --gen3D --conformer --ff MMFF94s -omol2 ligh$i.mol2

Step 4: Generate PDBQT from .mol2

for i in {1..11320}; do
babel -imol2 ../Add-H/Minimized/ligh$i.mol2 -opdbqt lig$i.pdbqt
Prepare the receptor:

Install MGL Tools and UCSF Chimera

How to identify binding site in case of apo structure where there is less information about the binding site. You can use a server named SiteHound Web which can rank the binding sites based on the energy cluster. You can upload the PDB structure (remove waters and other non-standard residues before) and upload. The server will generate an output like the Figure 1. You can see it also gives dimension of Center X, Y and Z corresponds to each binding site which you can use as Autodock Vina input (in the VS.bash file, scroll down and follow the tutorial). You can click on each cluster and see the residues correspond to that cluster.
Figure 1. The SitehoundWeb results interface.
Some other servers for binding site recognition:
I like the server named MetaPocket but more servers and softwares are available with
1. Download the receptor from the RCSB PDB.

2. Open the receptor in UCSF Chimera. You can see that the non-terminal missing residues are highlighted as dot marks.

Now it is time to add missing terminal residues: Tools > Structure editing> Model/Refine loops>
It will open up a new window. Check the boxes as follows:
Model/remodel: non-terminal missing structure
Allow this many residues adjacent to missing regions to move : 1
Number of models to generate: 5
Loop modelling protocol: standard
Run Modeller using: Choose either web service or local installation
Modeller license key: Put the modeller license key which you can get from MODELLER website

Click OK

You can see the job progress below the Chimera main window <left bottom corner>.

 The Modeller run may take several minutes and is handled as a background task. Clicking the information icon near the bottom of the Chimera window will bring up the Task Panel, in which the job can be canceled if desired.

When the five models have been generated, they will be opened in Chimera and their evaluation scores shown in a Model List dialog. The models can be viewed individually or collectively by choosing rows in the dialog with the mouse. The different scores from Modeller use different criteria and will not necessarily agree on which models are best:

    GA341 – model score derived from statistical potentials; a value > 0.7 generally indicates a reliable model, >95% probability of having the correct fold
    zDOPE – normalized Discrete Optimized Protein Energy (DOPE), an atomic distance-dependent statistical score; negative values indicate better models

Although Modeller scores were developed for globular proteins and thus have limited applicability to the trans-membrane protein in this tutorial, another reason for the poor (positive) zDOPE scores is that the termini extend far beyond the template structure and cannot be modeled reliably. Displaying all the models at once shows little conformational variability except in the termini, and to a lesser extent, the untemplated part of the third intracellular loop. This conclusion is reinforced by the RMSD histogram in the sequence window, where bar heights indicate root-mean-square distances among the α-carbons of the residues associated with a column.

Choose the model number with highest Estimated Overlap and a relatively low z-score and save the model as PDB (remove all other models).

Check here:

3. Setting up the GRID box with Autodock Tools:

Open AutodockTools

Click File > Read Molecule > open the receptor PDB file
Once it is open you can expand the receptor PDB by clicking on the arrow in the Molecules panel.
Figure 2. Autodock Tools Window when you open the protein.
Select the active site residues around which you want to build the grid box by selecting the residues and make a Space Fill representation.


Figure 3. On the Molecules window you can click on B and C and make a space fill representation.
Now it is time to make the GRID box.

Click on Grid > Grid Box

And you will see the Grid Box in the GUI.

Now you can adjust the Number of Points and Centre of Grid X, Y and Z.

I would suggest to keep the spacing at 1.0 Angstrom and the number of points in X, Y and Z dimension not more than 22, 22 and 22 unless it is a big binding site.

Figure 4. The GRID set up window in AutoDock Tools.
Once you set the GRID properly just note down the X, Y and Z values for number of points and Centre Grid Box.

4. PDBQT file preparation

Download the RACCOON and open the GUI.
 Follow instructions here

You can upload multiple ligands file and the receptor file and raccoon will add charges and will create the PDBQT files.

Now we are good to for docking.

5. Docking

We will use Autodock Vina for docking as it can dock multiple ligands with a receptor within a short period of time. You can see the installation instruction in the Autodock Vina webpage

Now you can create a file named conf.txt with the following input
#give the name of the receptor 
receptor = rec.pdbqt
exhaustiveness  = 8
#Put the X, Y, Z dimensions from Centre Grid Box
center_x = 46.00
center_y = -40.00
center_z = 44.00
#Put the X, Y and Z dimensions from number of points
size_x = 22
size_y = 22
size_z = 22
Then you need a script to automatically dock multiple ligands using Autodock Vina

I name this script VS.bash (make it executable using chmod +x VS.bash in Mac or Linux)

for f in lig*.pdbqt; do  b=`basename $f .pdbqt`; echo Processing ligand $b; mkdir -p data01; vina --config conf.txt --ligand $f --out data01/$f.pdbqt --log data01/$f.txt; done
#result analysis
cd data01
grep "   1 " *.txt | cut -c35-42,1-12 >>result
sort -rt" " +2 result >result2
cat result2
*The VS.bash script can dock multiple ligands in the single receptor which makes it perfect for virtual screening workflow. 
Note: Make sure you have the correct path for vina

Now you are ready to start the docking.

Execute the VS.bash using the command ./VS.bash

You can see the different docking poses and their binding free energies with the receptor something like the following:

# If you used AutoDock Vina in your work, please cite:          #
#                                                               #
# O. Trott, A. J. Olson,                                        #
# AutoDock Vina: improving the speed and accuracy of docking    #
# with a new scoring function, efficient optimization and       #
# multithreading, Journal of Computational Chemistry 31 (2010)  #
# 455-461                                                       #
#                                                               #
# DOI 10.1002/jcc.21334                                         #
#                                                               #
# Please see for more information.      #

Detected 4 CPUs
Reading input ... done.
Setting up the scoring function ... done.
Analyzing the binding site ... done.
Using random seed: 1357485783
Performing search ... done.
Refining results ... done.

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
   1         -7.8      0.000      0.000
   2         -7.8      2.411      7.553
   3         -7.7      3.009      7.400
   4         -7.6      3.820      5.984
   5         -7.4      2.771      8.463
   6         -7.3      3.226      7.606
   7         -7.1      2.947      8.640
   8         -7.1      1.984      3.930
   9         -7.1      2.884      8.842
Writing output ... done.
6. How to see the docking poses?

Open Chimera> File> Open the rec.pdb

Click Tools>Surface/Binding Analysis>ViewDock>Go to data folder and open the filed with extension .pdbqt.pdbqt

7. How to SAVE the docked pose?

Save the docked complex as shown below

Screen Shot 2016-10-25 at 22.24.30.png

Figure 5. Chimera Window during saving the docked ligand+receptor.
1.Work inside a specific folders while executing all these steps.
2. Alternatively you can use Mcule 1-Click Docking and their Structure based and Ligand Based Drug Design protocols. Go here
Important points to remember during docking:
In order to prove that some docking protocol is better than other it is better to participate in blind prediction challenges such as D3R Grand Challenges or SAMPL challenges. In my point of view testing all the docking software’s and protocols in the blind prediction challenge will be the ultimate test for all the docking softwares.
In some cases the docking poses are unstable specially when there is a lot of rotatable  bonds in the ligands hence we need a long enough MD simulation ( in the range of 1 microsecond or more) to characterize all the possible binding poses which are energetically favored or use enhanced sampling techniques such as Metadynamics.
*Specific suggestions to improve the tutorial will be much appreciated with proper credit.

1 reply »

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s