Time-lagged independent component analysis is a dimensionality reduction technique which transforms a high dimensional dataset to lower dimensional co-ordinates. It’s philosophy is kind of similar to PCA (principal component analysis).
For more details check here.
It is a very recently developed technique which has a lot of promise. tICA can separate slow degrees of freedom which can be used as an input in metadynamics or any other enhanced sampling techniques for better sampling of slow transitions.
Note: Google all the recent papers which used tICA technique and you will find this technique pretty interesting. However, it needs to be tested in large scale systems to actually say something about its advantage over PCA.
This is my own tutorial where I will show how to generate tICA from a simulation and render the tICA co-ordinates as Plumed input file.
You will need following softwares before we start the tutorial.
- Python 3.4 or higher
- Jupyter Notebook
- MSM Builder 3.8 or higher
- MSM explorer
- MDTraj 1.9.0
Here we go:
Download the trajectory and topology (in this case PDB file) from here
Make sure the path of input is correct. Open a new Python 3 terminal in Jupyter Notebook
#Importing all the modules #msmbuilder imports from msmbuilder.dataset import dataset from msmbuilder.featurizer import ContactFeaturizer from msmbuilder.decomposition import tICA from msmbuilder.cluster import MiniBatchKMeans from msmbuilder.msm import ContinuousTimeMSM from msmbuilder.utils import verbosedump,verboseload #other imports import os,glob import numpy as np import mdtraj as md import pandas as pd import pickle #prettier plots import seaborn as sns %pylab inline matplotlib.rcParams['xtick.labelsize'] = 14 matplotlib.rcParams['ytick.labelsize'] = 14 matplotlib.rcParams['font.family'] = "sans-serif"
#It is time to import the trajectory and the topology files trajectory_folder_path=("*.xtc") t = md.load('../tfit.xtc', top='../prot0.pdb') ds = dataset('../tfit.xtc', topology='../prot0.pdb') #get some stats about the dataset print("Number of trajectories is %d"%len(ds))
This will give a output like
Number of trajectories is 1
#define our featurizer. I want to use the distance as a CV residue_ind = np.array([[35, 79],[215, 293]]) feat = ContactFeaturizer(scheme='ca', contacts= np.array([[35, 79],[215, 293]])) #transform the data with the featurizer ds_contact=ds.fit_transform_with(feat, "res_contact/",fmt='dir-npy') #we can load the features using ds_contact = dataset("./res_contact/") #see how many trajectories were retained during featurization print(len(ds_contact),len(ds))
You will see an output like following
1 1
#define our tica model tica_mdl = tICA(lag_time=1000,n_components=2) #transform the dataset with tica tica_features = ds_contact.fit_transform_with(tica_mdl, out_ds = 'reg_ticas_1000lt_2tics//') #load the features tica_features = dataset("./reg_ticas_1000lt_2tics/") #for each feature from our contact featurization we now have 2 tICs tica_features[0].shape
Note: Choose the lag_time carefully. Go through the literatures to know more about lag time and set it according to your suitability.
Output
(6809, 2)
#lets dump the tica mdl for future use verbosedump(tica_mdl,"tica_mdl.pkl")
Output
Saving "tica_mdl.pkl"... (<class 'msmbuilder.decomposition.tica.tICA'>)
Time to render tICA model in a Plumed like file
import pandas as pd import mdtraj as md from msmbuilder.utils import load from msmbuilder.featurizer import ContactFeaturizer from tica_metadynamics.plumed_writer import render_tica_plumed_file #python imports import os,glob import numpy as np import pickle
#make sure that you have already fit the model using the notebooks above. tica_model = load("./tica_mdl.pkl") top = md.load("../prot0.pdb")
# swap this for whatever you have. The code for now supports contacts, dihedral, and angles. feat =ContactFeaturizer([[35,79],[215,293]], scheme='ca')
# this basically maps every feature to atom indices. df1 = pd.DataFrame(feat.describe_features(top))
print(df1)
Output
atominds featuregroup featurizer otherinfo resids resnames \ 0 [[534], [1203]] ca Contact 20 [35, 79] [ASP, VAL] 1 [[3336], [4558]] ca Contact 20 [215, 293] [ASH, LEU] resseqs 0 [36, 80] 1 [216, 294]
# to see the first 2 contacts for example print(df1.head()) output = render_tica_plumed_file(tica_mdl= tica_model, df= df1, n_tics= 2, multiple_tics=None) # output is a dictionary whose integer values are the tICs # to get the 0th tIC print(output[0]) # to get the 1st tIC print(output[1])
Output
RESTART DISTANCE ATOMS=535,1204 LABEL=ca_35_79 DISTANCE ATOMS=3337,4559 LABEL=ca_215_293 MATHEVAL ARG=ca_35_79 FUNC=x-1.53548354803 LABEL=meanfree_None_ca_35_79 PERIODIC=NO MATHEVAL ARG=ca_215_293 FUNC=x-1.49968935084 LABEL=meanfree_None_ca_215_293 PERIODIC=NO COMBINE LABEL=tic0 ARG=meanfree_None_ca_35_79,meanfree_None_ca_215_293 COEFFICIENTS=-0.138211254458,-11.6562007437 PERIODIC=NO METAD ARG=tic0 SIGMA=0.2 HEIGHT=1.0 FILE=HILLS TEMP=300 PACE=1000 LABEL=metad BIASFACTOR=50 PRINT ARG=tic0,metad.bias STRIDE=1000 FILE=BIAS RESTART DISTANCE ATOMS=535,1204 LABEL=ca_35_79 DISTANCE ATOMS=3337,4559 LABEL=ca_215_293 MATHEVAL ARG=ca_35_79 FUNC=x-1.53548354803 LABEL=meanfree_None_ca_35_79 PERIODIC=NO MATHEVAL ARG=ca_215_293 FUNC=x-1.49968935084 LABEL=meanfree_None_ca_215_293 PERIODIC=NO COMBINE LABEL=tic1 ARG=meanfree_None_ca_35_79,meanfree_None_ca_215_293 COEFFICIENTS=-2.78923032711,-0.309608739165 PERIODIC=NO METAD ARG=tic1 SIGMA=0.2 HEIGHT=1.0 FILE=HILLS TEMP=300 PACE=1000 LABEL=metad BIASFACTOR=50 PRINT ARG=tic1,metad.bias STRIDE=1000 FILE=BIAS
A little bit of adjustment will give you a Plumed file which you can use for Metadynamics simulations
DISTANCE ATOMS=535,1204 LABEL=d1 DISTANCE ATOMS=3337,4559 LABEL=d2 MATHEVAL ARG=d1 FUNC=x-1.53548354803 LABEL=c1 PERIODIC=NO MATHEVAL ARG=d2 FUNC=x-1.49968935084 LABEL=c2 PERIODIC=NO COMBINE LABEL=tic0 ARG=c1,c2 COEFFICIENTS=-0.138211254458,-11.6562007437 PERIODIC=NO COMBINE LABEL=tic1 ARG=c1,c2 COEFFICIENTS=-2.78923032711,-0.309608739165 PERIODIC=NO UPPER_WALLS ARG=d1,d2 AT=2.2,1.65 KAPPA=5000.0,5000.0 EXP=4,4 EPS=1,1 OFFSET=0,0 LABEL=uwall LOWER_WALLS ARG=d1,d2 AT=1.0,1.3 KAPPA=5000.0,5000.0 EXP=4,4 EPS=1,1 OFFSET=0,0 LABEL=lwall #METAD ARG=tic0,tic1 SIGMA=0.02,0.02 HEIGHT=1.0 FILE=HILLS TEMP=300 PACE=1000 LABEL=metad BIASFACTOR=15 PRINT ARG=tic0,tic1,d1,d2,c1,c2 STRIDE=1 FILE=COLVAR
Try to run plumed driver using the above input to confirm if everything is okay or not.