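Order-Disorder Transformations with pymatgen (Part 1)

pymatgen's transformations module provides many classes for transforming structures; with them you can easily dope a structure, build a supercell, add oxidation states, and so on. This article introduces the two classes used for order-disorder transformations: EnumerateStructureTransformation and OrderDisorderedStructureTransformation. Like the supercell program, these two classes turn a disordered structure into ordered ones.

Below we use the cubic perovskite CaTiO3 as an example to learn how to build models with this module. The CaTiO3 structure file can be downloaded from the Materials Project website (mp-5827).

About order-disorder transformations

First, let's look at what ordered and disordered structures are. Open the CIF file we downloaded from the Materials Project: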

# generated using pymatgen
data_CaTiO3
_symmetry_space_group_name_H-M   Pm-3m
_cell_length_a   3.88947141
_cell_length_b   3.88947141
_cell_length_c   3.88947141
_cell_angle_alpha   90.00000000
_cell_angle_beta   90.00000000
_cell_angle_gamma   90.00000000
_symmetry_Int_Tables_number   221
_chemical_formula_structural   CaTiO3
_chemical_formula_sum   'Ca1 Ti1 O3'
_cell_volume   58.83987623
_cell_formula_units_Z   1
loop_
 _symmetry_equiv_pos_site_id
 _symmetry_equiv_pos_as_xyz
  1  'x, y, z'
  2  '-x, -y, -z'
  3  '-y, x, z'
  4  'y, -x, -z'
  5  '-x, -y, z'
  6  'x, y, -z'
  7  'y, -x, z'
  8  '-y, x, -z'
  9  'x, -y, -z'
  10  '-x, y, z'
  11  '-y, -x, -z'
  12  'y, x, z'
  13  '-x, y, -z'
  14  'x, -y, z'
  15  'y, x, -z'
  16  '-y, -x, z'
  17  'z, x, y'
  18  '-z, -x, -y'
  19  'z, -y, x'
  20  '-z, y, -x'
  21  'z, -x, -y'
  22  '-z, x, y'
  23  'z, y, -x'
  24  '-z, -y, x'
  25  '-z, x, -y'
  26  'z, -x, y'
  27  '-z, -y, -x'
  28  'z, y, x'
  29  '-z, -x, y'
  30  'z, x, -y'
  31  '-z, y, x'
  32  'z, -y, -x'
  33  'y, z, x'
  34  '-y, -z, -x'
  35  'x, z, -y'
  36  '-x, -z, y'
  37  '-y, z, -x'
  38  'y, -z, x'
  39  '-x, z, y'
  40  'x, -z, -y'
  41  '-y, -z, x'
  42  'y, z, -x'
  43  '-x, -z, -y'
  44  'x, z, y'
  45  'y, -z, -x'
  46  '-y, z, x'
  47  'x, -z, y'
  48  '-x, z, -y'
loop_
 _atom_site_type_symbol
 _atom_site_label
 _atom_site_symmetry_multiplicity
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy
  Ca  Ca0  1  0.500000  0.500000  0.500000  1
  Ti  Ti1  1  0.000000  0.000000  0.000000  1
  O  O2  3  0.000000  0.000000  0.500000  1

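There is a lot here. From the first part we only need a few items:

Space group: Pm-3m
Chemical formula: CaTiO3
Number of each atom: Ca1 Ti1 O3

Now let's look at the last few lines: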

 _atom_site_type_symbol
 _atom_site_label
 _atom_site_symmetry_multiplicity
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy
  Ca  Ca0  1  0.500000  0.500000  0.500000  1
  Ti  Ti1  1  0.000000  0.000000  0.000000  1
  O  O2  3  0.000000  0.000000  0.500000  1

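The seven tag lines at the top name the seven columns of the rows below them:

Element symbol	Site label	Multiplicity	x (fractional)	y (fractional)	z (fractional)	Occupancy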
Ca	Ca0	1	0.5	0.5	0.5	1
Ti	Ti1	1	0	0	0	1
O	O2	3	0	0	0.5	1

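As you can see, every atom in CaTiO3 has an occupancy of 1 and the structure obeys the symmetry of its space group, so it is an ordered structure, in other words a perfect crystal. A disordered structure, as the name suggests, is one that is not a perfect crystal; here it generally means a structure with occupancy != 1.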
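To make the occupancy criterion concrete, here is a minimal sketch in plain Python that reads the occupancy column from the three atom-site rows shown above and applies the "every occupancy equals 1" test. The helper names and the doped example (Ca occupancy set to 0.6) are made up for illustration and are not part of pymatgen.

```python
# Atom-site rows copied from the CIF above:
# symbol, label, multiplicity, x, y, z, occupancy
ATOM_SITE_ROWS = """
Ca  Ca0  1  0.500000  0.500000  0.500000  1
Ti  Ti1  1  0.000000  0.000000  0.000000  1
O  O2  3  0.000000  0.000000  0.500000  1
"""


def parse_occupancies(rows):
    """Return {site label: occupancy} from whitespace-separated rows."""
    occupancies = {}
    for line in rows.strip().splitlines():
        fields = line.split()
        occupancies[fields[1]] = float(fields[6])
    return occupancies


def is_ordered(occupancies, tol=1e-8):
    """Treat a structure as ordered when every site occupancy equals 1."""
    return all(abs(occ - 1.0) < tol for occ in occupancies.values())


catio3 = parse_occupancies(ATOM_SITE_ROWS)
print(catio3, is_ordered(catio3))

# Hypothetical doped variant: Ca occupancy lowered to 0.6 (illustrative)
doped = dict(catio3, Ca0=0.6)
print(doped, is_ordered(doped))
```

The pristine cell passes the test, while the variant with a fractional Ca occupancy does not, which is exactly the situation that requires ordering before a DFT run.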

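When studying doping we run into this all the time: when substitutional doping happens at a site, some of the atoms there are replaced and the occupancy drops below 1. Another case is a material known to have a perovskite, spinel, or similar structure for which no structure file is available. Take lithium lanthanum titanate (LLTO), a solid electrolyte with very high ionic conductivity. Analysis of XRD data shows that it has a perovskite structure in which Li and La occupy the A site, i.e. the Ca position. We can therefore start from the CaTiO3 structure and obtain an LLTO structure file by substituting Ca.

But here is the problem: after substitution the elements have occupancies below 1, so the file now describes a disordered structure, whereas calculations with software such as VASP require every atom to have an occupancy of 1. What do we do?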

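A common approach is to build a supercell: expand the current cell several times along the three directions, distribute the fractionally occupying elements over the sites in proportion, compute the energies of the resulting structures, and take the one with the lowest energy as the structure we need.
For example, expanding three times in every direction gives a 3x3x3 supercell whose formula becomes Ca27Ti27O81. For the composition with the highest conductivity, Li0.33La0.56TiO3: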

Li : 0.33 x 27 = 8.91 ≈ 9
La : 0.56 x 27 = 15.12 ≈ 15
Ti : 1 x 27 = 27
O : 3 x 27 = 81

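LLTO's formula can now be written as Li9La15Ti27O81. In other words, we replaced the 27 Ca on the A site with 9 Li, 15 La, and 3 vacancies, which gives a supercell of 132 atoms that we can use as the LLTO cell. This process is called ordering: it turns a fractionally occupied, disordered structure into an ordered one.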
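The bookkeeping above is easy to check with a few lines of Python. This is only an arithmetic sketch: the fractions are the ones listed above, and the variable names are illustrative.

```python
# Amounts per formula unit, taken from the list above
fractions = {"Li": 0.33, "La": 0.56, "Ti": 1.0, "O": 3.0}
n_cells = 3 * 3 * 3  # number of unit cells in a 3x3x3 supercell

# Round each scaled amount to the nearest whole number of atoms
counts = {el: round(frac * n_cells) for el, frac in fractions.items()}

# Whatever is left of the 27 A sites after placing Li and La stays empty
a_site_vacancies = n_cells - counts["Li"] - counts["La"]
total_atoms = sum(counts.values())

print(counts, a_site_vacancies, total_atoms)
```

The rounding step is where the composition shifts slightly: 8.91 Li becomes 9 and 15.12 La becomes 15, so the ordered cell is close to, but not exactly, the nominal stoichiometry.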

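But that raises another question: how many structures can the substitution produce? A little combinatorics: C(27, 15) x C(12, 9) = 17383860 x 220 = 3824449200. For a simple substitution we could list every structure by enumeration, but with an astronomical number of combinations like this we obviously cannot list them all.

In this case a workable approach is to generate a number of structures by random substitution, rank them by electrostatic energy, keep the few with the lowest energy, and then run DFT calculations on those to find the lowest-energy one.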
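The count can be reproduced with Python's standard library. The sketch assumes the counting scheme implied by the numbers above: first choose 15 of the 27 A sites for La, then 9 of the remaining 12 for Li, with the last 3 left vacant.

```python
from math import comb

a_sites = 27
n_la, n_li = 15, 9

ways_la = comb(a_sites, n_la)         # choose the La sites
ways_li = comb(a_sites - n_la, n_li)  # choose the Li sites among the rest
total = ways_la * ways_li

print(ways_la, ways_li, total)
```

This counts every arrangement separately; many of them are equivalent by symmetry, so the number of truly distinct structures is smaller, but still far too large to enumerate by hand.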
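The sample-then-rank idea can be sketched without any library. The toy below places 9 Li (+1), 15 La (+3), and 3 vacancies on a 3x3x3 grid of A sites and ranks random arrangements by a crude pairwise Coulomb sum that uses the nearest periodic image. This is an assumption-laden stand-in for illustration only: it is not an Ewald summation, it ignores the Ti and O sublattices, and the function names are hypothetical.

```python
import itertools
import math
import random

N = 3  # 3x3x3 grid of A sites, lattice constant set to 1
SITES = list(itertools.product(range(N), repeat=3))
# 9 Li (+1), 15 La (+3) and 3 vacancies (charge 0) on the 27 A sites
CHARGES = [1] * 9 + [3] * 15 + [0] * 3


def min_image_distance(p, q):
    """Distance between two sites using the nearest periodic image."""
    deltas = []
    for a, b in zip(p, q):
        d = abs(a - b)
        deltas.append(min(d, N - d))
    return math.sqrt(sum(d * d for d in deltas))


def toy_coulomb_energy(charges):
    """Sum of q_i * q_j / r_ij over site pairs (arbitrary units, not Ewald)."""
    energy = 0.0
    for i, j in itertools.combinations(range(len(SITES)), 2):
        r = min_image_distance(SITES[i], SITES[j])
        energy += charges[i] * charges[j] / r
    return energy


def sample_and_rank(n_samples, n_keep, seed=0):
    """Draw random arrangements and return the n_keep lowest-energy ones."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        charges = CHARGES[:]
        rng.shuffle(charges)
        samples.append((toy_coulomb_energy(charges), charges))
    samples.sort(key=lambda item: item[0])
    return samples[:n_keep]


best = sample_and_rank(n_samples=200, n_keep=5)
for energy, _ in best:
    print(round(energy, 3))
```

Only the handful of arrangements that survive this cheap screening would then be passed on to DFT, which is the expensive step.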

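Two methods

Below we introduce the two methods in turn, using LSTF as the example. It is a solid electrolyte with a cubic perovskite structure. For convenience we ignore F and consider only the five elements Li, Sr, Ta, Hf, and O, with oxidation states +1, +2, +5, +4, and -2. Attentive readers may notice that the charges do not balance, but that does not matter here.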
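The charge imbalance mentioned above can be quantified in a couple of lines. The amounts per formula unit are an assumption: they are the occupancies used in the substitution map in the code further down (Li 0.38, Sr 0.44, Ta 0.7, Hf 0.3) together with three oxygen atoms.

```python
# (amount per formula unit, oxidation state); amounts are assumed from the
# substitution map used later in the article
composition = {
    "Li": (0.38, +1),
    "Sr": (0.44, +2),
    "Ta": (0.70, +5),
    "Hf": (0.30, +4),
    "O": (3.00, -2),
}

net_charge = sum(amount * charge for amount, charge in composition.values())
print(f"net charge per formula unit: {net_charge:+.2f}")
```

With these numbers the cations add up to +5.96 against -6 from oxygen, so the cell is short by about 0.04 positive charges per formula unit.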

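First we need to design the structure. For the sake of symmetry I choose to build a 3x3x3 supercell. With the site fractions used in the code below (Li 0.38 and Sr 0.44 on the A site, Ta 0.7 and Hf 0.3 on the B site), multiplying by 27 and rounding gives roughly Li10Sr12Ta19Hf8O81.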

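pymatgen offers several ways to build a supercell: call the structure's make_supercell() method directly, or use the SupercellTransformation class, which returns a new structure without changing the original.

The OrderDisorderedStructureTransformation class

First let's see how to do the ordering with OrderDisorderedStructureTransformation. The class lives in the pymatgen.transformations.standard_transformations module: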
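To see what building a supercell does to the list of sites, here is a library-free sketch that replicates the five atoms of the cubic CaTiO3 cell three times along each axis. It only illustrates the idea and is not how pymatgen implements it; the function name is hypothetical, and the three O positions are the symmetry-equivalent copies of (0, 0, 0.5).

```python
import itertools

# Fractional coordinates of the 5 atoms in the cubic CaTiO3 cell
UNIT_CELL = [
    ("Ca", (0.5, 0.5, 0.5)),
    ("Ti", (0.0, 0.0, 0.0)),
    ("O", (0.0, 0.0, 0.5)),
    ("O", (0.0, 0.5, 0.0)),
    ("O", (0.5, 0.0, 0.0)),
]


def build_supercell(sites, n):
    """Replicate sites n times along each axis.

    Coordinates are rescaled so that they stay fractional with respect
    to the enlarged cell.
    """
    supercell = []
    for shift in itertools.product(range(n), repeat=3):
        for element, frac in sites:
            new_frac = tuple((f + s) / n for f, s in zip(frac, shift))
            supercell.append((element, new_frac))
    return supercell


supercell = build_supercell(UNIT_CELL, 3)
counts = {}
for element, _ in supercell:
    counts[element] = counts.get(element, 0) + 1
print(len(supercell), counts)
```

The 27 Ca positions produced this way are the A sites over which the dopants and vacancies get distributed during ordering.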

class OrderDisorderedStructureTransformation(algo=0, symmetrized_structures=False, no_oxi_states=False)

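As you can see, the class takes three optional parameters:

algo: the algorithm used for the ordering; defaults to 0 and can usually be left alone
- 0: ALGO_FAST
- 1: ALGO_COMPLETE
- 2: ALGO_BEST_FIRST
symmetrized_structures: whether the input structures are symmetrized structures; defaults to False
no_oxi_states: whether to remove oxidation states before ordering; defaults to False, i.e. they are kept

The oxidation state tells us whether a species carries charge. Our structure file contains neutral atoms, so there is no charge. The electrostatic energy, however, is the interaction among all the charges, and it cannot be computed unless the charge of every element is known. We can add oxidation states with the Structure class's add_oxidation_state_by_element method, or with the OxidationStateDecorationTransformation class from the same module.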

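Attributes and methods
Attributes:

lowest_energy_structure: the structure with the lowest energy
_all_structures: list of the structures produced by the ordering, each stored as a dictionary
inverse: not meaningful here, can be ignored
is_one_to_many: whether the transformation returns multiple results, can be ignored

Method: apply_transformation(structure, return_ranked_list=False)
This does the ordering. structure is the structure to order, and return_ranked_list controls the return value:

False: the default; returns the structure with the lowest Ewald energy
An integer: returns up to that many structures as a list of dictionaries. Each dictionary holds one structure plus other information; the structure is accessed through the structure key.

Code: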

import os
from pymatgen.io.cif import CifParser, CifWriter
from pymatgen.transformations.standard_transformations import SubstitutionTransformation, OrderDisorderedStructureTransformation, SupercellTransformation

# read in the CaTiO3 structure from cif file 
filename = 'CaTiO3_mp-5827_symmetrized.cif'
parser = CifParser(filename)
init_structure = parser.get_structures()[0]

# add oxidation states to the structure
data = {"Ca":2, "Ti":4, "O":-2}
init_structure.add_oxidation_state_by_element(data)

# substitute to get the partially occupied structure
species_map = {"Ca2+":{"Li1+":0.38, "Sr2+":0.44},"Ti4+":{"Ta5+":0.7,"Hf4+":0.3}}
substitution = SubstitutionTransformation(species_map)
structure = substitution.apply_transformation(init_structure)

# make a 3x3x3 supercell (modifies structure in place)
structure.make_supercell(3)
# Alternatively, SupercellTransformation returns a new structure
# and leaves the original untouched:
# sc = SupercellTransformation.from_scaling_factors(3, 3, 3)
# supercell = sc.apply_transformation(structure)

# ordering: set return_ranked_list to 100 to get up to 100 structures
order = OrderDisorderedStructureTransformation()
ordered_structures = order.apply_transformation(structure, return_ranked_list=100)

# save the ordered structures to files, numbered from 001
os.makedirs('ordering', exist_ok=True)
for i, s in enumerate(ordered_structures, start=1):
    cif = CifWriter(s['structure'])
    cif.write_file('ordering/structure_{:0>3d}.cif'.format(i))

When you run this code, you will find that it fails with an error:

ValueError: Occupancy fractions not consistent with size of unit cell

Why? Let's look at the class's source code and find where the error is raised:

for k, v in total_occupancy.items():
    if abs(v - round(v)) > 0.25:
        raise ValueError("Occupancy fractions not consistent "
                         "with size of unit cell")

Now look at what the documentation says about the class:

simple rounding of the occupancies are performed, with no attempt made to achieve a target composition. This is usually not a problem for most ordering problems, but there can be times where rounding errors may result in structures that do not have the desired composition.

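Well, well. The documentation says the occupancies are simply rounded, which one would read as ordinary rounding to the nearest integer. The code, however, sets a limit and refuses to continue when the deviation exceeds 0.25. For Li, 0.38 x 27 = 10.26 is 0.26 above the 10 we chose, which is already over 0.25. Let's modify the code and change 0.38 to 0.37, so that 0.37 x 27 = 9.99 and the problem goes away:

# substitute to get the partially occupied structure
species_map = {"Ca2+":{"Li1+":0.37, "Sr2+":0.44},"Ti4+":{"Ta5+":0.7,"Hf4+":0.3}}
substitution = SubstitutionTransformation(species_map)
structure = substitution.apply_transformation(init_structure)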
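The tolerance check quoted from the source code can be reproduced with plain arithmetic, which is handy for testing a set of fractions before launching a long run. The sketch below is not pymatgen code: check_occupancies is a hypothetical helper, and it assumes the total occupancy of a species is simply its fraction times the 27 equivalent sites in the 3x3x3 supercell.

```python
def check_occupancies(fractions, n_sites, tol=0.25):
    """Return the species whose total occupancy is too far from an integer.

    Mirrors the quoted condition abs(v - round(v)) > tol.
    """
    offenders = {}
    for species, frac in fractions.items():
        total = frac * n_sites
        deviation = abs(total - round(total))
        if deviation > tol:
            offenders[species] = round(deviation, 2)
    return offenders


n_sites = 27  # equivalent A (or B) sites in the 3x3x3 supercell
original = {"Li+": 0.38, "Sr2+": 0.44, "Ta5+": 0.7, "Hf4+": 0.3}
adjusted = {**original, "Li+": 0.37}

print(check_occupancies(original, n_sites))
print(check_occupancies(adjusted, n_sites))
```

Only Li is flagged for the original fractions, and nothing is flagged once Li is lowered to 0.37, which matches the behaviour described above.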

Run the program again, and my computer sinks into deep thought...
Back to the documentation:

The algorithm can currently compute approximately 5,000,000 permutations per minute.

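The algorithm can handle about 5 million permutations per minute, but we have hundreds of millions of structures, so the program needs several minutes to run. It should be somewhat faster on a supercomputer or a server.

When the program finishes, you will find a new ordering folder in the script directory containing the ordered structures, with file names of the form structure_***.cif, sorted by Ewald energy per atom from low to high.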