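AutoDock Vina and MGLTools on Linux (command-line workflow)

Summary:

Step 1. MGLTools

Protein: in the Edit menu, add polar hydrogens and Kollman charges, then choose Grid > Macromolecule > Choose; the receptor is saved automatically as pro1.pdbqt.

Ligand: Ligand > Input > QuickSetup; the result is saved automatically as a .out.pdbqt file.

Step 2. AutoDock Vina

Write the parameter file config1.txt: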

receptor = clusters_0001_model1.pdbqt

ligand = hypericin.out.pdbqt

center_x = 2.148  # a first version of this file can be generated in MGLTools via Docking > Output > Vina

center_y = 3.704

center_z = -2.81

size_x = 74.55  # larger than 30 x 30 x 30 Angstrom, so exhaustiveness should be increased

size_y = 74.55

size_z = 74.55

out = cluster1_hypericin_out.pdbqt

log = cluster1_hypericin_out.log

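exhaustiveness = 24  # can be set to the number of available CPUs, because the runs execute in parallel. exhaustiveness controls how many independent runs one docking job performs: higher values take longer, and very large values bring little benefit, so balance speed against accuracy.

num_modes = 30  # fewer than 30 may be written: the search may find fewer than 30 conformations, or energy_range may cut the list

energy_range = 6  # kcal/mol

# optional flexible residues: flex = side_chains.pdbqt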

Run:

vina --config config1.txt
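For repeated or scripted runs, the same command can be assembled from Python. This is a minimal sketch, assuming the `vina` executable is on the PATH; `build_vina_command` and `run_vina` are hypothetical helper names, and only the `--config`, `--cpu` and `--seed` options from the option list further down are used.

```python
import subprocess


def build_vina_command(config_path, cpu=None, seed=None, vina_exe="vina"):
    """Return the argument list for one Vina run (hypothetical helper)."""
    cmd = [vina_exe, "--config", config_path]
    if cpu is not None:
        cmd += ["--cpu", str(cpu)]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


def run_vina(config_path, **options):
    """Run Vina; raises CalledProcessError if Vina exits with an error."""
    return subprocess.run(build_vina_command(config_path, **options), check=True)


if __name__ == "__main__":
    # Only print the command, so the sketch also works where Vina is not installed.
    print(" ".join(build_vina_command("config1.txt", cpu=8, seed=42)))
```

Fixing the seed makes a run repeatable; changing it gives an independent attempt with the same config.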

Analyzing the results:

Open cluster1_hypericin_out.pdbqt in PyMOL to inspect the docked poses.


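Detailed settings and steps follow:

1. Prepare the protein and small-molecule coordinate files (pdbqt) with MGLTools

Protein: add hydrogens (add all hydrogens or only polar hydrogens; non-polar hydrogens and their charges are merged with their parent carbon atom), compute charges (assign partial atomic charges to the ligand and the macromolecule: Gasteiger or Kollman United Atom charges), assign atom types, and define any flexible residues.

1. Edit: add polar hydrogens; non-polar hydrogens are merged automatically. The number of atoms in the pdbqt changes, and atoms may be renumbered when hydrogens are added.

2. Edit: add Kollman United Atom charges (the only charge set that can be added here; Gasteiger charges can be computed instead, and their values are more negative). After charges are added, the charge column in the pdbqt changes.

Ligand: add hydrogens, compute charges, define the root (the center of the torsion tree), and choose the rotatable bonds (set up rotatable bonds in the ligand using a graphical version of AutoTors). I simply used Quick Setup.

Tips: hydrogen positions are arbitrary and depend only on the input file. As for charges, AutoDock Vina ignores the user-supplied partial charges. It has its own way of dealing with the electrostatic interactions through the hydrophobic and the hydrogen bonding terms.

Save the pdbqt files: pro1.pdbqt, pro2.pdbqt, pro3.pdbqt, lig1.pdbqt, lig2.pdbqt, lig3.pdbqt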


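Figure: the original pdbqt

Figure: after adding hydrogens (atom serial numbers increase, charges unchanged)

Figure: after adding charges (charges change)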
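The effect of each preparation step can also be checked without screenshots by comparing atom counts and charges between two pdbqt files. The sketch below assumes the usual fixed-width PDBQT layout (serial in columns 7-11, partial charge in columns 71-76); `atom_line` only builds made-up demo records and is not part of any tool.

```python
def read_pdbqt_atoms(lines):
    """Return (serial, atom_name, charge) for each ATOM/HETATM record."""
    atoms = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])
            name = line[12:16].strip()
            charge = float(line[70:76])  # partial charge, columns 71-76
            atoms.append((serial, name, charge))
    return atoms


def summarize(lines):
    """Return (atom count, total charge rounded to 3 decimals)."""
    atoms = read_pdbqt_atoms(lines)
    return len(atoms), round(sum(charge for _, _, charge in atoms), 3)


def atom_line(serial, name, x, y, z, charge, atom_type):
    """Build a fixed-width PDBQT ATOM record from made-up demo data."""
    return (f"ATOM  {serial:>5} {name:<4} GLY A   1    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}  1.00  0.00    {charge:6.3f} {atom_type:<2}")


if __name__ == "__main__":
    before = [atom_line(1, "N", 0.0, 0.0, 0.0, -0.350, "N"),
              atom_line(2, "CA", 1.45, 0.0, 0.0, 0.100, "C")]
    # Adding a hydrogen inserts a record and shifts the serial numbers after it.
    after_h = [atom_line(1, "N", 0.0, 0.0, 0.0, -0.350, "N"),
               atom_line(2, "HN", -0.5, 0.9, 0.0, 0.000, "HD"),
               atom_line(3, "CA", 1.45, 0.0, 0.0, 0.100, "C")]
    print("before:", summarize(before))
    print("after adding H:", summarize(after_h))
```

For real files, pass `open("pro1.pdbqt").read().splitlines()` to `summarize` before and after each MGLTools step.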

2. Write the parameter file config.txt

e.g. config.txt

receptor = clusters_0001_model1.pdbqt

ligand = hypericin.out.pdbqt

center_x = 2.148

center_y = 3.704

center_z = -2.81

size_x = 74.55  # larger than 30 x 30 x 30 Angstrom, so exhaustiveness should be increased

size_y = 74.55

size_z = 74.55

out = cluster1_hypericin_out.pdbqt

log = cluster1_hypericin_out.log

exhaustiveness=15

num_modes = 30  # fewer than 30 may be written: the search may find fewer than 30 conformations, or energy_range may cut the list

energy_range=6



With the config above, the energies of the poses differed only slightly. I requested 30 modes, but only 20 valid conformations were written.
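To see how many modes were actually written and how close their energies are, the results table in the log file can be parsed. This is a rough sketch that assumes the table layout printed by Vina 1.1.x (a `-----+` separator row followed by one row per mode); the sample text is made up, and `parse_vina_log` is a hypothetical helper name.

```python
SAMPLE_LOG = """\
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1         -8.3      0.000      0.000
   2         -8.1      1.523      2.345
   3         -7.9      3.210      5.678
Writing output ... done.
"""


def parse_vina_log(text):
    """Return a list of (mode, affinity, rmsd_lb, rmsd_ub) tuples."""
    results = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("-----+"):
            in_table = True  # the result rows follow this separator
            continue
        if in_table:
            fields = line.split()
            if len(fields) != 4:
                break  # first line that is not a result row ends the table
            try:
                row = (int(fields[0]), float(fields[1]),
                       float(fields[2]), float(fields[3]))
            except ValueError:
                break  # four words, but not numbers: also past the table
            results.append(row)
    return results


if __name__ == "__main__":
    modes = parse_vina_log(SAMPLE_LOG)
    spread = round(modes[-1][1] - modes[0][1], 3)
    print(len(modes), "modes, energy spread", spread, "kcal/mol")
```

For a real run, read the file named by the `log` option, for example `parse_vina_log(open("cluster1_hypericin_out.log").read())`.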

Details:

Input:

  --receptor arg        rigid part of the receptor (PDBQT)

  --flex arg            flexible side chains, if any (PDBQT)

  --ligand arg          ligand (PDBQT)

Search space (required): the search space effectively restricts where the movable atoms, including those in the flexible side chains, can be.

How big should the search space be?

As small as possible, but not smaller. The smaller the search space, the easier it is for the docking algorithm to explore it. On the other hand, it will not explore ligand and flexible side chain atom positions outside the search space. You should probably avoid search spaces bigger than 30 x 30 x 30 Angstrom, unless you also increase "--exhaustiveness".

  --center_x arg        X coordinate of the center

  --center_y arg        Y coordinate of the center

  --center_z arg        Z coordinate of the center

  --size_x arg          size in the X dimension (Angstroms)

  --size_y arg          size in the Y dimension (Angstroms)

  --size_z arg          size in the Z dimension (Angstroms)
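One way to keep the box "as small as possible" is to derive it from the coordinates of a known binding site or reference ligand. The sketch below is only an illustration: the 5 Angstrom padding is an assumed value rather than a Vina recommendation, the helper names are hypothetical, and coordinates are read from the standard fixed-width columns 31-54.

```python
def read_coords(pdbqt_lines):
    """Extract (x, y, z) from ATOM/HETATM records (columns 31-54)."""
    coords = []
    for line in pdbqt_lines:
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    return coords


def box_from_coords(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box around coords.

    padding is added on every side; 5.0 Angstrom is an assumption.
    """
    center, size = [], []
    for values in zip(*coords):  # iterate over the x, y and z columns
        lo, hi = min(values), max(values)
        center.append(round((lo + hi) / 2.0, 3))
        size.append(round((hi - lo) + 2.0 * padding, 3))
    return tuple(center), tuple(size)


def config_lines(center, size):
    """Format the six search-space entries of a Vina config file."""
    lines = [f"center_{axis} = {value}" for axis, value in zip("xyz", center)]
    lines += [f"size_{axis} = {value}" for axis, value in zip("xyz", size)]
    return lines


if __name__ == "__main__":
    demo = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0)]  # made-up coordinates
    center, size = box_from_coords(demo)
    print("\n".join(config_lines(center, size)))
```

The printed lines can be pasted into config.txt in place of the hand-typed center and size values.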

Output (optional):

  --out arg            output models (PDBQT), the default is chosen based on

                        the ligand file name

  --log arg            optionally, write log file

Misc (optional):

  --cpu arg                the number of CPUs to use (the default is to try to detect the number of CPUs or, failing that, use 1)

  --seed arg                explicit random seed

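  --exhaustiveness arg (=8) exhaustiveness of the global search (roughly proportional to time): 1+  // With the default (or any given) exhaustiveness setting, the time spent on the search already varies heuristically with the number of atoms, the flexibility, and so on. Normally it does not make sense to spend extra time searching to push the probability of missing the global minimum of the scoring function far below the probability that this minimum is itself far from the native conformation. However, if you feel the automatic trade-off between exhaustiveness and time is inadequate, you can raise the exhaustiveness value. This increases the run time linearly and lowers the probability of not finding the minimum.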

  --num_modes arg (=9)      maximum number of binding modes to generate  // changed to 30

  --energy_range arg (=3)  maximum energy difference between the best binding

                            mode and the worst one displayed (kcal/mol)  // changed to 8

Configuration file (optional):

  --config arg          the above options can be put here

Information (optional):

  --help                display usage summary

  --help_advanced      display usage summary with advanced options

  --version            display program version

Output:

1. Energy

The predicted binding affinity is in kcal/mol.

2. RMSD

RMSD values are calculated relative to the best mode and use only movable heavy atoms. Two variants of RMSD metrics are provided, rmsd/lb (RMSD lower bound) and rmsd/ub (RMSD upper bound), differing in how the atoms are matched in the distance calculation:

rmsd/ub matches each atom in one conformation with itself in the other conformation, ignoring any symmetry

rmsd' matches each atom in one conformation with the closest atom of the same element type in the other conformation (rmsd' can not be used directly, because it is not symmetric)

rmsd/lb is defined as follows: rmsd/lb(c1, c2) = max(rmsd'(c1, c2), rmsd'(c2, c1))
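The three definitions above translate almost directly into code. The sketch below follows them literally for two conformations given as lists of `(element, (x, y, z))` heavy atoms in the same order. It illustrates the definitions and is not Vina's own implementation; it needs Python 3.8 or later for `math.dist`.

```python
import math


def rmsd_ub(c1, c2):
    """Upper bound: atom i of c1 is matched with atom i of c2."""
    total = sum(math.dist(a[1], b[1]) ** 2 for a, b in zip(c1, c2))
    return math.sqrt(total / len(c1))


def rmsd_prime(c1, c2):
    """Match each atom of c1 to the closest same-element atom of c2.

    Not symmetric: rmsd_prime(c1, c2) can differ from rmsd_prime(c2, c1).
    """
    total = 0.0
    for element, xyz in c1:
        nearest = min(math.dist(xyz, other)
                      for other_element, other in c2
                      if other_element == element)
        total += nearest ** 2
    return math.sqrt(total / len(c1))


def rmsd_lb(c1, c2):
    """Lower bound: the larger of the two one-sided rmsd' values."""
    return max(rmsd_prime(c1, c2), rmsd_prime(c2, c1))


if __name__ == "__main__":
    # Made-up symmetric fragment: the two oxygens swap places between poses.
    pose_a = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0)), ("O", (-1.0, 0.0, 0.0))]
    pose_b = [("C", (0.0, 0.0, 0.0)), ("O", (-1.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0))]
    print("rmsd/ub:", round(rmsd_ub(pose_a, pose_b), 3))
    print("rmsd/lb:", round(rmsd_lb(pose_a, pose_b), 3))
```

In the demo the two poses occupy the same positions, so rmsd/lb is zero while rmsd/ub is not. This is why a large gap between the two columns often points to a symmetry-related pose rather than a genuinely different one.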

3. Hydrogen positions

Vina uses a united-atom scoring function. As in AutoDock, polar hydrogens are needed in the input structures to correctly type heavy atoms as hydrogen bond donors. However, in Vina, the degrees of freedom that only move hydrogens, such as the hydroxyl group torsions, are degenerate. Therefore, in the output, some hydrogen atoms can be expected to be positioned randomly (but consistent with the covalent structure). For a united-atom treatment, this is essentially a cosmetic issue.

4. Separate models (use vina_split to split the output into multiple pdbqt files)

All predicted binding modes, including the positions of the flexible side chains are placed into one multimodel PDBQT file specified by the "out" parameter or chosen by default, based on the ligand file name. If needed, this file can be split into individual models using a separate program called "vina_split", included in the distribution.

Note: on Windows, run vina_split from cmd: change to the directory that contains vina_split and run vina_split --input **.pdbqt
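When vina_split is not at hand, or the per-model energies are needed in a script, the multimodel file can also be split in Python. This sketch assumes each model is wrapped in MODEL/ENDMDL records and carries a `REMARK VINA RESULT:` line, as in Vina output files; the sample text is made up, and the output file naming in `write_models` is a hypothetical choice, not the naming vina_split uses.

```python
SAMPLE_OUT = """\
MODEL 1
REMARK VINA RESULT:      -8.3      0.000      0.000
REMARK  (atom records omitted in this made-up sample)
ENDMDL
MODEL 2
REMARK VINA RESULT:      -8.1      1.523      2.345
REMARK  (atom records omitted in this made-up sample)
ENDMDL
"""


def split_models(text):
    """Split multimodel PDBQT text into one list of lines per model."""
    models, current = [], None
    for line in text.splitlines():
        if line.startswith("MODEL"):
            current = []
        elif line.startswith("ENDMDL"):
            if current is not None:
                models.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return models


def model_affinity(model_lines):
    """Return the affinity (kcal/mol) from the VINA RESULT remark, or None."""
    for line in model_lines:
        if line.startswith("REMARK VINA RESULT:"):
            return float(line.split()[3])
    return None


def write_models(models, prefix):
    """Write each model to <prefix>_NN.pdbqt (naming is an assumption)."""
    paths = []
    for index, lines in enumerate(models, start=1):
        path = f"{prefix}_{index:02d}.pdbqt"
        with open(path, "w") as handle:
            handle.write("\n".join(lines) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    for number, model in enumerate(split_models(SAMPLE_OUT), start=1):
        print("model", number, "affinity", model_affinity(model))
```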




1. Why am I seeing a warning about the search space volume being over 27000 Angstrom^3?

This is probably because you intended to specify the search space sizes in "grid points" (0.375 Angstrom), as in AutoDock 4. The AutoDock Vina search space sizes are given in Angstroms instead. If you really intended to use an unusually large search space, you can ignore this warning, but note that the search algorithm's job may be harder. You may need to increase the value of the exhaustiveness to make up for it. This will lead to longer run time.
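A quick arithmetic check makes the units issue concrete. The sketch below uses the 0.375 Angstrom spacing and the 27000 cubic Angstrom threshold quoted above; the function names are hypothetical, and the point-to-Angstrom conversion is the simple product of point count and spacing.

```python
GRID_SPACING = 0.375    # Angstrom per AutoDock 4 grid point, as quoted above
VOLUME_WARNING = 27000  # cubic Angstrom, i.e. 30 x 30 x 30


def points_to_angstrom(n_points, spacing=GRID_SPACING):
    """Convert an AutoDock 4 style size in grid points to Angstroms."""
    return n_points * spacing


def box_volume(size_x, size_y, size_z):
    """Volume of the search box in cubic Angstroms."""
    return size_x * size_y * size_z


def exceeds_warning(size):
    """True if Vina would be expected to warn about this box size."""
    return box_volume(*size) > VOLUME_WARNING


if __name__ == "__main__":
    tutorial_box = (74.55, 74.55, 74.55)  # sizes used in the config above
    print("volume:", round(box_volume(*tutorial_box)), "cubic Angstrom")
    print("warning expected:", exceeds_warning(tutorial_box))
    print("60 grid points =", points_to_angstrom(60), "Angstrom")
```

The 74.55 Angstrom cube from the example config is roughly 15 times the warning threshold, which is why the notes above raise exhaustiveness for it.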

2. The bound conformation looks reasonable, except for the hydrogens. Why?

AutoDock Vina actually uses a united-atom scoring function, i.e. one that involves only the heavy atoms. Therefore, the positions of the hydrogens in the output are arbitrary. The hydrogens in the input file are used to decide which atoms can be hydrogen bond donors or acceptors though, so the correct protonation of the input structures is still important.

3. What does "exhaustiveness" really control, under the hood? (exhaustiveness is the number of runs; the runs execute in parallel, so it can be set to the number of CPUs to make full use of them)

In the current implementation, the docking calculation consists of a number of independent runs, starting from random conformations. Each of these runs consists of a number of sequential steps. Each step involves a random perturbation of the conformation followed by a local optimization (using the Broyden-Fletcher-Goldfarb-Shanno algorithm) and a selection in which the step is either accepted or not. Each local optimization involves many evaluations of the scoring function as well as its derivatives in the position-orientation-torsions coordinates. The number of evaluations in a local optimization is guided by convergence and other criteria. The number of steps in a run is determined heuristically, depending on the size and flexibility of the ligand and the flexible side chains. However, the number of runs is set by the exhaustiveness parameter. Since the individual runs are executed in parallel, where appropriate, exhaustiveness also limits the parallelism. Unlike in AutoDock 4, in AutoDock Vina, each run can produce several results: promising intermediate results are remembered. These are merged, refined, clustered and sorted automatically to produce the final result.
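The structure described above (independent runs, each a chain of random perturbation, local optimization and accept/reject) can be illustrated with a toy example. Everything below is an assumption made for illustration: the one-dimensional `score` function stands in for the real scoring function, plain gradient descent replaces BFGS, the acceptance rule is a simple Metropolis test, and the runs execute sequentially rather than in parallel. It is not Vina's code.

```python
import math
import random


def score(x):
    """Toy 1-D stand-in for a scoring function with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def local_optimize(x, step=0.01, iters=200):
    """Gradient descent with a numerical derivative (stand-in for BFGS)."""
    for _ in range(iters):
        grad = (score(x + 1e-5) - score(x - 1e-5)) / 2e-5
        x -= step * grad
    return x


def single_run(rng, steps=20, temperature=0.5):
    """One independent run: perturb, optimize locally, accept or reject."""
    current = local_optimize(rng.uniform(-5.0, 5.0))  # random start
    best = current
    for _ in range(steps):
        candidate = local_optimize(current + rng.gauss(0.0, 1.0))
        delta = score(candidate) - score(current)
        # Metropolis test: always accept improvements, sometimes accept worse.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if score(current) < score(best):
            best = current  # remember the most promising result so far
    return best


def dock(exhaustiveness, seed=0):
    """Perform `exhaustiveness` runs and return the best-scoring result."""
    rng = random.Random(seed)
    results = [single_run(rng) for _ in range(exhaustiveness)]
    return min(results, key=score)


if __name__ == "__main__":
    for runs in (1, 8):
        best = dock(runs)
        print(runs, "run(s): x =", round(best, 3), "score =", round(score(best), 3))
```

With a fixed seed, the first of many runs is identical to a single run, so adding runs can only keep or improve the best score. That is the sense in which higher exhaustiveness lowers the chance of missing the minimum, at a cost that grows linearly with the number of runs.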

4. Why do I not get the correct bound conformation?

It can be any of a number of things:

If you are coming from AutoDock 4, a very common mistake is to specify the search space in "points" (0.375 Angstrom), instead of Angstroms.

Your ligand or receptor might not have been correctly protonated. (the starting structure was not properly prepared)

Bad luck (the search algorithm could have found the correct conformation with good probability, but was simply unlucky). Try again with a different seed.

The minimum of the scoring function corresponds to the correct conformation, but the search algorithm has trouble finding it. In this case, higher exhaustiveness or smaller search space should help.

The minimum of the scoring function simply is not where the correct conformation is. Trying over and over again will not help, but may occasionally give the right answer if two wrongs (inexact search and scoring) make a right. Docking is an approximate approach.

Related to the above, the culprit may also be the quality of the X-ray or NMR receptor structure.

If you are not doing redocking, i.e. using the correct induced fit shape of the receptor, perhaps the induced fit effects are large enough to affect the outcome of the docking experiment. (try a different receptor structure)

The rings can only be rigid during docking. Perhaps they have the wrong conformation, affecting the outcome.

You are using a 2D (flat) ligand as input.

The actual bound conformation of the ligand may occasionally be different from what the X-ray or NMR structure shows.

Other problems

5. How can I tweak the scoring function?

You can change the weights easily, by specifying them in the configuration file, or in the command line. For example

vina --weight_hydrogen -1.2 ...

doubles the strength of all hydrogen bonds.

Functionality that would allow the users to create new atom and pseudo-atom types, and specify their own interaction functions is planned for the future.

This should make it easier to adapt the scoring function to specific targets, model covalent docking and macro-cycle flexibility, experiment with new scoring functions, and, using pseudo-atoms, create directional interaction models.

6. Why don't I get as many binding modes as I specify with "--num_modes"?

This option specifies the maximum number of binding modes to output. The docking algorithm may find fewer "interesting" binding modes internally. The number of binding modes in the output is also limited by the "energy_range", which you may want to increase.

7. Why don't the results change when I change the partial charges?

AutoDock Vina ignores the user-supplied partial charges. It has its own way of dealing with the electrostatic interactions through the hydrophobic and the hydrogen bonding terms. See the original publication [*] for details of the scoring function.

8. How do I use flexible side chains?

You split the receptor into two parts: rigid and flexible, with the latter represented somewhat similarly to how the ligand is represented. See the section "Flexible Receptor PDBQT Files" of the AutoDock4.2 User Guide (page 14) for how to do this in AutoDock Tools. Then, you can issue this command: vina --config conf --receptor rigid.pdbqt --flex side_chains.pdbqt --ligand ligand.pdbqt. Also see this write-up on this subject.
