#001 Organizing a MySQL XML Data Dictionary with Python

1. Business background

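The data dictionary exported from the MySQL DB is in XML. We need to reorganize it into the company's standard Excel format and submit it to the data center for archiving.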

2. Converting the XML to a table with Python

2.1 First, a look at the XML layout of the data dictionary exported from MySQL:
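As a rough illustration only, the snippet below parses a small hypothetical document shaped the way the script in this post reads it: `table` elements with a `name` attribute and a `comment` child, each holding `column` elements with `name`, `type`, `length`, `decimal`, `jt` and `mandatory` attributes. The root element name `database`, the table `t_user`, the helper `describe` and all values are invented for the example, and a real export may nest things differently.

```python
from xml.dom.minidom import parseString

# Hypothetical sample: element nesting and values are assumptions, not a real export.
SAMPLE_XML = """<database name="demo">
    <table name="t_user">
        <comment>User table</comment>
        <column name="id" type="bigint" length="20" decimal="0" jt="-5" mandatory="true">
            <comment>Primary key</comment>
        </column>
        <column name="user_name" type="varchar" length="64" decimal="0" jt="12" mandatory="false">
            <comment>Login name</comment>
        </column>
    </table>
</database>"""

def describe(xml_text):
    """Return (table name, [column names]) pairs found in the XML text."""
    root = parseString(xml_text).documentElement
    result = []
    for table in root.getElementsByTagName("table"):
        cols = [c.getAttribute("name") for c in table.getElementsByTagName("column")]
        result.append((table.getAttribute("name"), cols))
    return result

print(describe(SAMPLE_XML))
```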



    
        
            
                
            
            
                
            
            
                
            
        

2.2 Approach and steps

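1. Read the XML file.
2. Walk the DOM to collect each table, its columns, and the column attributes into a list of rows.
3. Load the rows into a pandas DataFrame (DF).
4. Export the DataFrame to CSV or Excel.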
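The four steps can be sketched end to end in a few lines. This is a simplified stand-in for the full script, not a replacement for it: it parses an inline, made-up XML string instead of a file, keeps only three fields, and writes CSV with the standard `csv` module rather than a pandas DataFrame so that it runs without extra packages. The names `xml_to_rows` and `rows_to_csv` are hypothetical.

```python
import csv
import io
from xml.dom.minidom import parseString

# Made-up input; a real export carries more attributes per column.
XML_TEXT = """<database>
    <table name="t_order">
        <column name="order_id" type="bigint"/>
        <column name="amount" type="decimal"/>
    </table>
</database>"""

def xml_to_rows(xml_text):
    # Steps 1-2: parse the XML, then walk tables and columns into row dicts.
    root = parseString(xml_text).documentElement
    rows = []
    for table in root.getElementsByTagName("table"):
        for column in table.getElementsByTagName("column"):
            rows.append({
                "table": table.getAttribute("name"),
                "id": column.getAttribute("name"),
                "type": column.getAttribute("type"),
            })
    return rows

def rows_to_csv(rows, fieldnames):
    # Steps 3-4: lay the rows out as a table and serialize it as CSV text.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(xml_to_rows(XML_TEXT), ["table", "id", "type"])
print(csv_text)
```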

Finally, here is the code:

# -*- coding: utf-8 -*-
"""
    @Author  : Nick
    @Time    : 2023/8/29
    @Comment : 
"""
from xml.dom.minidom import parse
import pandas as pd

def readXML():
    xtree = parse("./p360-mysql.xml")
    # Document root element
    xroot = xtree.documentElement
    print(xroot.nodeName)

    df_cols = ["table","table_cn","id", "type", "length", "decimal", "jt","mandatory","column_cn"]
    rows = []

    # All <table> elements
    tables = xroot.getElementsByTagName("table")

    for table in tables:
        if table.hasAttribute("name"):
            t_id = table.getAttribute("name")
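            # Table comment; fall back to "" when it is missing or empty
            t_comments = table.getElementsByTagName("comment")
            t_id_cn = t_comments[0].firstChild.data if t_comments and t_comments[0].firstChild else ""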

            # Find elements by tag name
            columns = table.getElementsByTagName("column")
            # Iterate over the columns
            for column in columns:
                # col_id / col_type avoid shadowing the built-ins id() and type()
                col_id = column.getAttribute('name')
                col_type = column.getAttribute('type')
                length = column.getAttribute('length')
                decimal = column.getAttribute('decimal')
                jt = column.getAttribute('jt')
                mandatory = column.getAttribute('mandatory')

                comments = column.getElementsByTagName("comment")
                column_cn = ""
                for comment in comments:
                    # An empty <comment/> has no child node
                    if comment.firstChild is not None:
                        column_cn = comment.firstChild.data

                rows.append({"table":t_id,"table_cn":t_id_cn,"id": col_id, "type": col_type, "length": length, "decimal": decimal, "jt": jt, "mandatory": mandatory, "column_cn": column_cn})

    out_df = pd.DataFrame(rows, columns = df_cols)
    out_df.to_csv('p360.csv')

if __name__ == '__main__':
    readXML()
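One thing to watch in the script: `getElementsByTagName` returns matching descendants at any depth, so calling it on a table with `"comment"` also picks up the comments nested under each column. If a table has no comment of its own, index `[0]` would then quietly return the first column's comment. A possible guard, sketched here with a hypothetical helper `direct_child_text` and a made-up XML fragment, looks only at immediate children:

```python
from xml.dom.minidom import parseString

def direct_child_text(node, tag):
    """Text of the first immediate child element named `tag`, or "" if absent or empty."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == tag:
            first = child.firstChild
            if first is not None and first.nodeType in (first.TEXT_NODE, first.CDATA_SECTION_NODE):
                return first.data.strip()
            return ""
    return ""

# Made-up example: the table itself has no <comment>, only its column does.
doc = parseString(
    '<table name="t_demo"><column name="id"><comment>Primary key</comment></column></table>'
)
table = doc.documentElement
column = table.getElementsByTagName("column")[0]

table_comment = direct_child_text(table, "comment")    # "" because no direct child matches
column_comment = direct_child_text(column, "comment")  # "Primary key"
# The recursive lookup, by contrast, finds the column's comment under the table:
recursive_comment = table.getElementsByTagName("comment")[0].firstChild.data
print(repr(table_comment), repr(column_comment), repr(recursive_comment))
```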


2.3 Result after export

[Screenshot: 1693379527693.jpg]

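Note: the exported CSV is UTF-8 encoded (without a BOM), so Chinese text appears garbled when the file is opened in Excel. Converting the file's encoding to ANSI fixes this.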
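An alternative to re-encoding the file by hand is to write it with a UTF-8 byte order mark, which Excel generally uses to detect the encoding. With pandas this would be `out_df.to_csv('p360.csv', encoding='utf-8-sig')`. The sketch below shows the same idea using only the standard library; the helper `write_csv_with_bom`, the sample row and the temporary output path are made up for the example.

```python
import csv
import os
import tempfile

def write_csv_with_bom(path, header, rows):
    # "utf-8-sig" prepends the BOM bytes EF BB BF to the file.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

# Made-up data with Chinese text, written to a temporary location.
out_path = os.path.join(tempfile.mkdtemp(), "p360_demo.csv")
write_csv_with_bom(out_path, ["table", "table_cn"], [["t_user", "用户表"]])

# Read the raw bytes back to confirm the BOM is present.
with open(out_path, "rb") as f:
    raw = f.read()
print(raw[:3])
```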
