Can generated molecular images be recognized as SMILES, and can the recognized SMILES then be converted back into images?

1. Online converter: DECIMER Web Application

Its code is available at: GitHub - Kohulan/DECIMER_Short_Communication

2. https://github.com/Kohulan/DECIMER-Image_Transformer

I used option 2 here; following its README is enough.

3. Img2Mol: inferring molecules from pictures (also quite good)

GitHub - bayer-science-for-a-better-life/Img2Mol

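Option 3 is easier to use than option 2, but it works best with local-cddd: without the local installation it needs a network connection, which can cause many errors.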

conda env create -f environment.local-cddd.yml
conda activate img2mol
pip install .

Then:

If you are working with the local CDDD installation, download and unzip the CDDD model and move the directory default_model to path/to/anaconda3/envs/img2mol/lib/python3.6/site-packages/cddd/data/
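The move described above can also be scripted. The following is a minimal sketch, assuming the model has already been downloaded and unzipped; the helper name install_cddd_model and both of its arguments are hypothetical and not part of Img2Mol or CDDD. It only copies the default_model folder into the cddd/data/ directory of the given site-packages path.

```python
import shutil
from pathlib import Path


def install_cddd_model(unzipped_model_dir, env_site_packages):
    """Copy the unzipped default_model folder into <site-packages>/cddd/data/.

    Both arguments are supplied by the caller; nothing is downloaded here.
    """
    src = Path(unzipped_model_dir)
    dst = Path(env_site_packages) / "cddd" / "data" / "default_model"
    if not src.is_dir():
        raise FileNotFoundError("model folder not found: {}".format(src))
    if dst.exists():
        # Already in place: leave the existing copy untouched.
        return dst
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(str(src), str(dst))
    return dst


# Example call (paths are placeholders, adjust them to your machine):
# install_cddd_model("default_model",
#                    "path/to/anaconda3/envs/img2mol/lib/python3.6/site-packages")
```

The helper returns the destination path, so it is easy to check that the folder ended up where the instructions expect it.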

First, take a look at the file directory:

[Figure: screenshot of the file directory]

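One thing to note: I placed this folder inside the project as a subfolder, which can cause import errors. To fix this in PyCharm, right-click the folder and choose Mark Directory as > Sources Root.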
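When the script is run outside an IDE, a similar effect can be obtained by putting the folder on sys.path before the import. This is a minimal sketch under that assumption; add_source_root is a hypothetical helper, and the folder name in the example call is a placeholder rather than a path taken from the project.

```python
import sys
from pathlib import Path


def add_source_root(folder):
    """Put `folder` at the front of sys.path so packages inside it can be imported."""
    root = str(Path(folder).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Example call (placeholder folder name), to be made before the DECIMER import:
# add_source_root("path/to/subfolder")
```

Unlike the IDE setting, this only affects the running Python process, so it has to be executed at the top of the script.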

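Straight to the code.

The code of image2smiles2image.py:

input_images: the images generated by the generator G;
image2smiles_all.csv: all SMILES converted from the images generated by G
image2smiles_validity.csv: the valid SMILES converted from the images generated by G
image2smiles_unvalidity.csv: the invalid SMILES converted from the images generated by G
image2smiles2image: the folder holding the images obtained by converting G's images to SMILES and then back to images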
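Once the script has run, the share of valid SMILES can be recomputed from the CSV files alone. The sketch below assumes the files contain one row per image, as the list above describes; count_rows and validity_rate are hypothetical helpers, not part of the script.

```python
import csv


def count_rows(csv_path):
    """Count the non-empty rows of a CSV file."""
    with open(csv_path, newline='', encoding='utf-8') as f:
        return sum(1 for row in csv.reader(f) if row)


def validity_rate(all_csv, valid_csv):
    """Fraction of predicted SMILES that were valid (0.0 if nothing was predicted)."""
    total = count_rows(all_csv)
    valid = count_rows(valid_csv)
    return valid / total if total else 0.0


# Example call (same default paths as the script):
# validity_rate("../../eval_output_images/QM9/image2smiles_all.csv",
#               "../../eval_output_images/QM9/image2smiles_validity.csv")
```

The script opens its CSV files in append mode, so rows from earlier runs accumulate; delete the old files before a fresh evaluation if the counts should reflect a single run.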

command:

# python image2smiles2image.py --input_images_path ../../eval_output_images/QM9/generator_images --image2smiles2image_save_path ../../eval_output_images/QM9/image2smiles2image/ --image2smiles_all ../../eval_output_images/QM9/image2smiles_all.csv --image2smiles_validity ../../eval_output_images/QM9/image2smiles_validity.csv --image2smiles_unvalidity ../../eval_output_images/QM9/image2smiles_unvalidity.csv

code:


from DECIMER import predict_SMILES

import glob
import os
import csv
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D
import argparse

# Parse the command-line arguments
parser = argparse.ArgumentParser(description='Testing script', add_help=False)
parser.add_argument('--input_images_path', default='../../eval_output_images/QM9/generator_images', help='Input images folder')
parser.add_argument('--image2smiles2image_save_path', default='../../eval_output_images/QM9/image2smiles2image/')
parser.add_argument('--image2smiles_all', default='../../eval_output_images/QM9/image2smiles_all.csv')
parser.add_argument('--image2smiles_validity', default='../../eval_output_images/QM9/image2smiles_validity.csv')
parser.add_argument('--image2smiles_unvalidity', default='../../eval_output_images/QM9/image2smiles_unvalidity.csv')
args = parser.parse_args()


# Collect the image files under the input folder (the pattern matches .jpg and .png)
input_img_path = glob.glob(os.path.join(args.input_images_path, "*.[jp][pn]g"))
image2smiles2image_save_path = args.image2smiles2image_save_path

def mkdir(path):
    # Create the folder if it does not exist yet
    if not os.path.exists(path):
        os.makedirs(path)  # makedirs also creates any missing parent folders
        print("--- create new folder...  ---")
    else:
        print("---  There is this folder!  ---")
mkdir(image2smiles2image_save_path)

i = 0
unrecover_images = 0
for file in input_img_path:
    # os.path.basename handles the separator of the current OS ("\\" on Windows, "/" on Linux)
    file_name = os.path.basename(file)
    ### The image is processed below

    SMILES = predict_SMILES(file)
    i = i + 1
    print("The current process image ", i ," is :", file, ", And the SMILES is :", SMILES)

    # ----------------------- 1. Save all image2SMILES results ------------------------------------
    with open(args.image2smiles_all, 'a', newline='', encoding='utf-8') as f_all:
        csv.writer(f_all).writerow([SMILES])

    # ----------------------- 2. Save the image2SMILES2image result ------------------------------
    try:
        mol = Chem.MolFromSmiles(SMILES)
        if mol is None:
            # MolFromSmiles returns None (it does not raise) when the SMILES cannot be parsed
            raise ValueError("RDKit could not parse the SMILES")
        drawer = rdMolDraw2D.MolDraw2DCairo(256, 256)
        opts = rdMolDraw2D.MolDrawOptions()
        # Bond line width
        opts.bondLineWidth = 3
        # Minimum font size of the atom labels
        opts.minFontSize = 15
        drawer.SetDrawOptions(opts)
        rdMolDraw2D.PrepareAndDrawMolecule(drawer, mol)
        drawer.FinishDrawing()
        img_save_path = os.path.join(image2smiles2image_save_path, file_name)
        drawer.WriteDrawingText(img_save_path)

        # ----------------------- 2.1 Save the valid image2SMILES ------------------------------------
        with open(args.image2smiles_validity, 'a', newline='', encoding='utf-8') as f_validity:
            csv.writer(f_validity).writerow([SMILES])

    # The SMILES could not be converted into a molecular structure image
    except Exception as e:
        unrecover_images = unrecover_images + 1
        print("The current process image ", i, " is :", file, ", And the SMILES is :", SMILES, " is not valid !")
        # ----------------------- 2.2 Save the invalid image2SMILES ------------------------------------
        with open(args.image2smiles_unvalidity, 'a', newline='', encoding='utf-8') as f_unvalidity:
            # One row per failed image: source file and predicted SMILES
            csv.writer(f_unvalidity).writerow([file, SMILES])

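# Guard against dividing by zero when no image was found
success_rate = (i - unrecover_images) / i if i else 0.0
print("The total images is :", i, " , And the unrecover_images is :", unrecover_images, ", And the success rate is:", success_rate)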
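The script selects its input files with the glob pattern *.[jp][pn]g. Which names that pattern accepts is easy to misjudge, so here is a small check written with fnmatchcase from the standard library. The file names are made up for illustration. fnmatchcase is always case-sensitive, whereas glob on Windows typically ignores case, so this sketch shows the stricter behavior.

```python
from fnmatch import fnmatchcase

PATTERN = "*.[jp][pn]g"  # same pattern the script passes to glob.glob


def matches(name, pattern=PATTERN):
    """Case-sensitive check of a file name against the glob pattern."""
    return fnmatchcase(name, pattern)


# Made-up file names for illustration
names = ["mol_1.png", "mol_2.jpg", "mol_3.jpeg", "mol_4.jng", "mol_5.PNG"]
accepted = [n for n in names if matches(n)]
print(accepted)
```

The pattern takes .png and .jpg, skips .jpeg, and also lets through unintended combinations such as .jng. If .jpeg inputs are expected, a second glob call for "*.jpeg" is one simple option.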

