tICA Metadynamics

Time-lagged independent component analysis is a dimensionality reduction technique which transforms a high dimensional dataset to lower dimensional co-ordinates. It’s philosophy is kind of similar to PCA (principal component analysis).

For more details check here.

It is a very recently developed technique which has a lot of promise. tICA can separate slow degrees of freedom which can be used as an input in metadynamics or any other enhanced sampling techniques for better sampling of slow transitions.

Note: Google all the recent papers which used tICA technique and you will find this technique pretty interesting. However, it needs to be tested in large scale systems to actually say something about its advantage over PCA.

This is my own tutorial where I will show how to generate tICA from a simulation and render the tICA co-ordinates as Plumed input file.

You will need following softwares before we start the tutorial.

Python 3.4 or higher
Jupyter Notebook
MSM Builder 3.8 or higher
MSM explorer
MDTraj 1.9.0

Here we go:

Download the trajectory and topology (in this case PDB file) from here

Make sure the path of input is correct. Open a new Python 3 terminal in Jupyter Notebook

#Importing all the modules
#msmbuilder imports 
from msmbuilder.dataset import dataset
from msmbuilder.featurizer import ContactFeaturizer
from msmbuilder.decomposition import tICA
from msmbuilder.cluster import MiniBatchKMeans
from msmbuilder.msm import ContinuousTimeMSM
from msmbuilder.utils import verbosedump,verboseload

#other imports
import os,glob
import numpy as np
import mdtraj as md
import pandas as pd 
import pickle

#prettier plots
import seaborn as sns
%pylab inline
matplotlib.rcParams['xtick.labelsize'] = 14 
matplotlib.rcParams['ytick.labelsize'] = 14 
matplotlib.rcParams['font.family'] = "sans-serif"

#It is time to import the trajectory and the topology files
trajectory_folder_path=("*.xtc")
t = md.load('../tfit.xtc', top='../prot0.pdb')
ds = dataset('../tfit.xtc', topology='../prot0.pdb')

#get some stats about the dataset
print("Number of trajectories is %d"%len(ds))

This will give a output like

Number of trajectories is 1

#define our featurizer. I want to use the distance as a CV
residue_ind = np.array([[35, 79],[215, 293]])
feat = ContactFeaturizer(scheme='ca', contacts= np.array([[35, 79],[215, 293]]))

#transform the data with the featurizer
ds_contact=ds.fit_transform_with(feat, "res_contact/",fmt='dir-npy')

#we can load the features using
ds_contact = dataset("./res_contact/")

#see how many trajectories were retained during featurization
print(len(ds_contact),len(ds))

You will see an output like following

1 1

#define our tica model
tica_mdl = tICA(lag_time=1000,n_components=2)

#transform the dataset with tica
tica_features = ds_contact.fit_transform_with(tica_mdl, out_ds = 'reg_ticas_1000lt_2tics//')

#load the features
tica_features = dataset("./reg_ticas_1000lt_2tics/")

#for each feature from our contact featurization we now have 2 tICs
tica_features[0].shape

Note: Choose the lag_time carefully. Go through the literatures to know more about lag time and set it according to your suitability.

Output

(6809, 2)

#lets dump the tica mdl for future use
verbosedump(tica_mdl,"tica_mdl.pkl")

Output

Saving "tica_mdl.pkl"... (<class 'msmbuilder.decomposition.tica.tICA'>)

Time to render tICA model in a Plumed like file

import pandas as pd 
import mdtraj as md 
from msmbuilder.utils import load
from msmbuilder.featurizer import ContactFeaturizer
from tica_metadynamics.plumed_writer import render_tica_plumed_file

#python imports
import os,glob
import numpy as np
import pickle

#make sure that you have already fit the model using the notebooks above.
tica_model = load("./tica_mdl.pkl")
top = md.load("../prot0.pdb")

# swap this for whatever you have. The code for now supports contacts, dihedral, and angles. 
feat =ContactFeaturizer([[35,79],[215,293]], scheme='ca')

# this basically maps every feature to atom indices. 
df1 = pd.DataFrame(feat.describe_features(top))

print(df1)

Output

atominds featuregroup featurizer otherinfo resids resnames \
0 [[534], [1203]] ca Contact 20 [35, 79] [ASP, VAL] 
1 [[3336], [4558]] ca Contact 20 [215, 293] [ASH, LEU]

resseqs 
0 [36, 80] 
1 [216, 294]

# to see the first 2 contacts for example 
print(df1.head())

output  = render_tica_plumed_file(tica_mdl= tica_model, df= df1, n_tics= 2, multiple_tics=None)

# output is a dictionary whose integer values are the tICs
# to get the 0th tIC
print(output[0])

# to get the 1st tIC
print(output[1])

Output

RESTART
DISTANCE ATOMS=535,1204 LABEL=ca_35_79 

DISTANCE ATOMS=3337,4559 LABEL=ca_215_293 

MATHEVAL ARG=ca_35_79 FUNC=x-1.53548354803 LABEL=meanfree_None_ca_35_79 PERIODIC=NO 

MATHEVAL ARG=ca_215_293 FUNC=x-1.49968935084 LABEL=meanfree_None_ca_215_293 PERIODIC=NO 

COMBINE LABEL=tic0 ARG=meanfree_None_ca_35_79,meanfree_None_ca_215_293 COEFFICIENTS=-0.138211254458,-11.6562007437 PERIODIC=NO 
METAD ARG=tic0 SIGMA=0.2 HEIGHT=1.0 FILE=HILLS TEMP=300 PACE=1000 LABEL=metad BIASFACTOR=50
PRINT ARG=tic0,metad.bias STRIDE=1000 FILE=BIAS 
RESTART
DISTANCE ATOMS=535,1204 LABEL=ca_35_79 

DISTANCE ATOMS=3337,4559 LABEL=ca_215_293 

MATHEVAL ARG=ca_35_79 FUNC=x-1.53548354803 LABEL=meanfree_None_ca_35_79 PERIODIC=NO 

MATHEVAL ARG=ca_215_293 FUNC=x-1.49968935084 LABEL=meanfree_None_ca_215_293 PERIODIC=NO 

COMBINE LABEL=tic1 ARG=meanfree_None_ca_35_79,meanfree_None_ca_215_293 COEFFICIENTS=-2.78923032711,-0.309608739165 PERIODIC=NO 
METAD ARG=tic1 SIGMA=0.2 HEIGHT=1.0 FILE=HILLS TEMP=300 PACE=1000 LABEL=metad BIASFACTOR=50
PRINT ARG=tic1,metad.bias STRIDE=1000 FILE=BIAS

A little bit of adjustment will give you a Plumed file which you can use for Metadynamics simulations

DISTANCE ATOMS=535,1204 LABEL=d1
DISTANCE ATOMS=3337,4559 LABEL=d2

MATHEVAL ARG=d1 FUNC=x-1.53548354803 LABEL=c1 PERIODIC=NO 
MATHEVAL ARG=d2 FUNC=x-1.49968935084 LABEL=c2 PERIODIC=NO 

COMBINE LABEL=tic0 ARG=c1,c2 COEFFICIENTS=-0.138211254458,-11.6562007437 PERIODIC=NO
COMBINE LABEL=tic1 ARG=c1,c2 COEFFICIENTS=-2.78923032711,-0.309608739165 PERIODIC=NO 

UPPER_WALLS ARG=d1,d2 AT=2.2,1.65 KAPPA=5000.0,5000.0 EXP=4,4 EPS=1,1 OFFSET=0,0 LABEL=uwall
LOWER_WALLS ARG=d1,d2 AT=1.0,1.3 KAPPA=5000.0,5000.0 EXP=4,4 EPS=1,1 OFFSET=0,0 LABEL=lwall
 
#METAD ARG=tic0,tic1 SIGMA=0.02,0.02 HEIGHT=1.0 FILE=HILLS TEMP=300 PACE=1000 LABEL=metad BIASFACTOR=15
PRINT ARG=tic0,tic1,d1,d2,c1,c2 STRIDE=1 FILE=COLVAR

Try to run plumed driver using the above input to confirm if everything is okay or not.

Life in Computational Biology

tICA Metadynamics

Leave a comment Cancel reply

tICA Metadynamics

Share this:

Leave a comment Cancel reply