Docking with Autodock Vina: A step by step guide for Beginners or Advanced Users

Prepare the ligand:
Using MarvinSketch and OpenBabel:
MarvinSketch is an amazing JAVA based functionality which includes several basic and advanced functionalities and completely free for academic use. You can download the package from https://www.chemaxon.com/products/marvin/marvinsketch/
Steps:
1.1. Open the Marvin Sketch (MS)
Draw the ligand in MS or import the SMILE or InchI Keys or open an supplied ligand. MS supports SDF, MOL, MOL2, PDB and all other common file formats.
1.2. Adjust the protonation state of the ligand:
Click on MS and then Calculations > Protonation> Major Microscopes> a window will appear where you can input your desired pH and click OK and it should show a new window with the correct protonation state of the ligand at your desired pH. You can SAVE the ligand with correct protonation state in your desired file format.
Note:
If a window appeared saying that “The product will run in Evaluation Mode” then click OK. Open the saved ligand in Notepad/Wordpad or use vi (using Mac or Linux) to see if there is hydrogen in the saved file or not.
1.3. Now it is time to minimize the ligand:
Install a well used and handy tool named OpenBabel http://openbabel.org/wiki/Category:Guides and use it for the minimization procedure.
Optimize the molecule using MMFF94:
babel -isdf lig.sdf -h -p 6.5 --gen3D --conformer --ff MMFF94s -osdf lighopt.sdf
1.4. Convert to PDB or Mol2 format for visual inspection using OpenBabel:
babel -i lighopt.sdf -opdb lighopt.pdb
Imagine if you download a big Dataset and want to use all the ligands during the docking process. In that case follow as mentioned below:
Step 1: Split the molecules
babel -isdf dataset.sdf -osdf lig.sdf -m #This will split datset.sdf -m option does the splitting
Step 2: Convert to .mol2 format
#This will convert 11320 ligands into separate .mol2 files for i in {1..11320}; do babel -isdf lig$i.sdf -omol2 lig$i.mol2 done
Step 3: Generate approximated 3D conformation for all the individual .mol2 files and add Hydrogen based on a chosen pH
for i in {1..11320}; do babel -imol2 ../../lig$i.mol2 -h -p 6.5 --gen3D --conformer --ff MMFF94s -omol2 ligh$i.mol2 done
Step 4: Generate PDBQT from .mol2
for i in {1..11320}; do babel -imol2 ../Add-H/Minimized/ligh$i.mol2 -opdbqt lig$i.pdbqt done
Prepare the receptor:
Install MGL Tools http://mgltools.scripps.edu/ and UCSF Chimera https://www.cgl.ucsf.edu/chimera/
How to identify binding site in case of apo structure where there is less information about the binding site. You can use a server named SiteHound Web http://scbx.mssm.edu/sitehound/sitehound-web/Input.html which can rank the binding sites based on the energy cluster. You can upload the PDB structure (remove waters and other non-standard residues before) and upload. The server will generate an output like the Figure 1. You can see it also gives dimension of Center X, Y and Z corresponds to each binding site which you can use as Autodock Vina input (in the VS.bash file, scroll down and follow the tutorial). You can click on each cluster and see the residues correspond to that cluster.
Figure 1. The SitehoundWeb results interface.
Some other servers for binding site recognition:
I like the server named MetaPocket http://projects.biotec.tu-dresden.de/metapocket/ but more servers and softwares are available with http://www.click2drug.org/index.html#Binding%20site%20prediction
1. Download the receptor from the RCSB PDB.
2. Open the receptor in UCSF Chimera. You can see that the non-terminal missing residues are highlighted as dot marks.
Now it is time to add missing terminal residues: Tools > Structure editing> Model/Refine loops>
It will open up a new window. Check the boxes as follows:
Model/remodel: non-terminal missing structure
Allow this many residues adjacent to missing regions to move : 1
Number of models to generate: 5
Loop modelling protocol: standard
Run Modeller using: Choose either web service or local installation
Modeller license key: Put the modeller license key which you can get from MODELLER website https://salilab.org/modeller/
Click OK
You can see the job progress below the Chimera main window <left bottom corner>.
The Modeller run may take several minutes and is handled as a background task. Clicking the information icon near the bottom of the Chimera window will bring up the Task Panel, in which the job can be canceled if desired.
When the five models have been generated, they will be opened in Chimera and their evaluation scores shown in a Model List dialog. The models can be viewed individually or collectively by choosing rows in the dialog with the mouse. The different scores from Modeller use different criteria and will not necessarily agree on which models are best:
GA341 – model score derived from statistical potentials; a value > 0.7 generally indicates a reliable model, >95% probability of having the correct fold
zDOPE – normalized Discrete Optimized Protein Energy (DOPE), an atomic distance-dependent statistical score; negative values indicate better models
Although Modeller scores were developed for globular proteins and thus have limited applicability to the trans-membrane protein in this tutorial, another reason for the poor (positive) zDOPE scores is that the termini extend far beyond the template structure and cannot be modeled reliably. Displaying all the models at once shows little conformational variability except in the termini, and to a lesser extent, the untemplated part of the third intracellular loop. This conclusion is reinforced by the RMSD histogram in the sequence window, where bar heights indicate root-mean-square distances among the α-carbons of the residues associated with a column.
Choose the model number with highest Estimated Overlap and a relatively low z-score and save the model as PDB (remove all other models).
Check here: https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/tutorials/dor.html
3. Setting up the GRID box with Autodock Tools:
Open AutodockTools
Click File > Read Molecule > open the receptor PDB file
Once it is open you can expand the receptor PDB by clicking on the arrow in the Molecules panel.
Figure 2. Autodock Tools Window when you open the protein.
Select the active site residues around which you want to build the grid box by selecting the residues and make a Space Fill representation.
Figure 3. On the Molecules window you can click on B and C and make a space fill representation.
Now it is time to make the GRID box.
Click on Grid > Grid Box
And you will see the Grid Box in the GUI.
Now you can adjust the Number of Points and Centre of Grid X, Y and Z.
I would suggest to keep the spacing at 1.0 Angstrom and the number of points in X, Y and Z dimension not more than 22, 22 and 22 unless it is a big binding site.
Figure 4. The GRID set up window in AutoDock Tools.
Once you set the GRID properly just note down the X, Y and Z values for number of points and Centre Grid Box.
4. PDBQT file preparation
Download the RACCOON and open the raccoon.py GUI.
Follow instructions here http://autodock.scripps.edu/resources/raccoon
You can upload multiple ligands file and the receptor file and raccoon will add charges and will create the PDBQT files.
Now we are good to for docking.
5. Docking
We will use Autodock Vina for docking as it can dock multiple ligands with a receptor within a short period of time. You can see the installation instruction in the Autodock Vina webpage http://vina.scripps.edu/index.html
Now you can create a file named conf.txt with the following input
#give the name of the receptor receptor = rec.pdbqt exhaustiveness = 8 #Put the X, Y, Z dimensions from Centre Grid Box center_x = 46.00 center_y = -40.00 center_z = 44.00 #Put the X, Y and Z dimensions from number of points size_x = 22 size_y = 22 size_z = 22
Then you need a script to automatically dock multiple ligands using Autodock Vina
I name this script VS.bash (make it executable using chmod +x VS.bash in Mac or Linux)
#VS.bash #!/bin/bash for f in lig*.pdbqt; do b=`basename $f .pdbqt`; echo Processing ligand $b; mkdir -p data01; vina --config conf.txt --ligand $f --out data01/$f.pdbqt --log data01/$f.txt; done #result analysis cd data01 grep " 1 " *.txt | cut -c35-42,1-12 >>result #sort sort -rt" " +2 result >result2 cat result2
*The VS.bash script can dock multiple ligands in the single receptor which makes it perfect for virtual screening workflow.
Note: Make sure you have the correct path for vina
Now you are ready to start the docking.
Execute the VS.bash using the command ./VS.bash
You can see the different docking poses and their binding free energies with the receptor something like the following:
################################################################# # If you used AutoDock Vina in your work, please cite: # # # # O. Trott, A. J. Olson, # # AutoDock Vina: improving the speed and accuracy of docking # # with a new scoring function, efficient optimization and # # multithreading, Journal of Computational Chemistry 31 (2010) # # 455-461 # # # # DOI 10.1002/jcc.21334 # # # # Please see http://vina.scripps.edu for more information. # ################################################################# Detected 4 CPUs Reading input ... done. Setting up the scoring function ... done. Analyzing the binding site ... done. Using random seed: 1357485783 Performing search ... done. Refining results ... done. mode | affinity | dist from best mode | (kcal/mol) | rmsd l.b.| rmsd u.b. -----+------------+----------+---------- 1 -7.8 0.000 0.000 2 -7.8 2.411 7.553 3 -7.7 3.009 7.400 4 -7.6 3.820 5.984 5 -7.4 2.771 8.463 6 -7.3 3.226 7.606 7 -7.1 2.947 8.640 8 -7.1 1.984 3.930 9 -7.1 2.884 8.842 Writing output ... done.
6. How to see the docking poses?
Open Chimera> File> Open the rec.pdb
Click Tools>Surface/Binding Analysis>ViewDock>Go to data folder and open the filed with extension .pdbqt.pdbqt
7. How to SAVE the docked pose?
Save the docked complex as shown below
Thank you !!
Great tutorial
LikeLike