【python】遍历一个目录下的所有CSV并插入MONGODB

上一篇讲到把股票的历史数据下载成了CSV,查看虽然方便,但还是希望在数据库中存一份,方便存储和操作。

对新建的表还是和之前一样在mongodb compass里面插入一条记录,并建立索引
【python】遍历一个目录下的所有CSV并插入MONGODB_第1张图片
话不多说直接上代码:

'''
this file is used to import all scv under a directory into mongodb
'''
# encoding:latin1
import os
import pymongo
import csv
from pymongo import MongoClient
from pymongo import errors
import codecs


def str_to_hex(s):
    return ' '.join([hex(ord(c)).replace('0x', '') for c in s])


def analysis_file(file):
    csv_file = csv.reader(file)   # 将读入的文件转换成csv.reader对象
    head_row = next(csv_file)   #去除第一行标题

    for row in csv_file:
        history_item = dict()
        history_item["date"] = row[0]
        history_item["stock_id"] = row[1]
        history_item["stock_name"] = row[2]
        history_item["closing_price"] = row[3]
        history_item["highest_price"] = row[4]
        history_item["lowest_price"] = row[5]
        history_item["open_price"] = row[6]
        history_item["last_closing_price"] = row[7]
        history_item["rise_amount"] = row[8]
        history_item["rise_range"] = row[9]
        history_item["turn_over_rate"] = row[10]
        history_item["turn_over_hand"] = row[11]
        history_item["turn_over_amount"] = row[12]
        history_item["market_value"] = row[13]
        history_item["circulation_market_value"] = row[14]
        # print(history_item)
        try:
            collection.insert(history_item)
        except errors.DuplicateKeyError:
            pass


def read_file_list(file_path):
    file_list = os.listdir(file_path)  #读取路径下所有文件名
    for file in file_list:
        if file[-3:] != 'csv':  # 检查是不是csv文件
            continue
        path = os.path.join(file_path, file)  # 通过文件夹路径和文件名生成文件绝对路径
        print(path)
        f = codecs.open(path, 'rb', 'gbk')  # 使用适合的编码打开文件,不知道的话就多试几下
        analysis_file(f)


if __name__ == '__main__':
    client = MongoClient('127.0.0.1:27017')  #连接mongodb
    db = client['stock']  #选择库
    collection = db['stock_history']  # 选择表
    read_file_list('/stock_history/stock_history/history_data')  # 路径自定义

运行后就可以看到结果了:
【python】遍历一个目录下的所有CSV并插入MONGODB_第2张图片
数据已经顺利导入很简单吧~

你可能感兴趣的:(实战区,python)