Two ways to store images in MongoDB

This article is reposted from https://blog.csdn.net/qq_23926575/article/details/79271436

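There are two ways to store images in MongoDB: convert the image to BSON binary data and store it inside an ordinary document, or store it with GridFS, which MongoDB provides. Each has its pros and cons. I have not benchmarked them, so I cannot compare their performance; this post only describes how to use each one. If you know the subject in depth, corrections are welcome.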
The following record is used as the example throughout:
Sample data

dic = {
    "owner_name" : "samssmilin",
    "photo_id" : "602880671",
    "tags" : "",
    "longitude" : "-121.106479",
    "height" : "766",
    "datetaken" : "2004-01-17 21:05:35",
    "width" : "1024",
    "length" : 38141,
    "photo_title" : "Dad and Elijah",
    "latitude" : "35.565222",
    "photo_url" : "https://farm2.staticflickr.com/1063/602880671_c2f4511ef4_b.jpg",
    "dateupload" : "1075355967",
    "owner_id" : "45365637@N00"
}

1. GridFS

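GridFS stores the image data separately from the image's attribute data: a chunks collection holds the image data and a files collection holds the attributes. One image file may map to several chunks. Each chunk has a fixed maximum size (255 KB by default; 16 MB is the size limit of a single BSON document, not of a chunk). If the image is larger than one chunk, it is split across several chunks, each of which records the ObjectID of its file document in its files_id field, and the chunks are merged back into the image data automatically on download.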
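To make the chunking concrete, here is a minimal sketch in plain Python (no database needed) of how a byte string would be cut into fixed-size pieces. The helper name split_into_chunks is hypothetical, and the 255 KB figure assumes the default GridFS chunk size; it only illustrates the idea and is not how the driver is implemented.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # assumed GridFS default chunk size, in bytes


def split_into_chunks(data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Hypothetical helper: cut data into consecutive pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


# The sample image is 38141 bytes, so it fits in a single chunk.
small = split_into_chunks(b"\x00" * 38141)
# A 600 KB payload needs three chunks: 255 KB + 255 KB + 90 KB.
large = split_into_chunks(b"\x00" * (600 * 1024))
print(len(small), [len(c) for c in large])
```

Joining the pieces in order gives back the original bytes, which is in essence what happens when a GridFS file is read.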
Upload

import requests
from gridfs import GridFS
from pymongo import MongoClient

client = MongoClient('127.0.0.1', 27017)  # connect to MongoDB
db = client.photo  # select the database
# db.authenticate("username", "password")  # removed in PyMongo 4; pass username/password to MongoClient instead
fs = GridFS(db, collection="images")  # open the GridFS bucket named "images"
data = requests.get(dic["photo_url"], timeout=10).content
# Save only after confirming the image is not already in the database
if not fs.find_one({"photo_url": dic["photo_url"]}):
    fs.put(data, **dic)
# After a successful upload, two collections appear in the photo database: images.files and images.chunks
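To see how the two collections relate after an upload, the following sketch looks up one file document and counts its chunks. The helper name chunk_summary is hypothetical; it assumes the bucket is called "images" as in the upload code, that db is a PyMongo database object, and that photo_url was stored as a field of the file document because it was passed to fs.put.

```python
def chunk_summary(db, photo_url, bucket="images"):
    """Hypothetical helper: report size and chunk count for one stored image."""
    files = db[bucket + ".files"]    # one document per stored file
    chunks = db[bucket + ".chunks"]  # one document per chunk of file data
    file_doc = files.find_one({"photo_url": photo_url})
    if file_doc is None:
        return None
    # Every chunk points back to its file through the files_id field.
    n_chunks = chunks.count_documents({"files_id": file_doc["_id"]})
    return {
        "length": file_doc["length"],        # total size in bytes
        "chunkSize": file_doc["chunkSize"],  # maximum size of each chunk
        "chunks": n_chunks,
    }
```

With the db object from the upload code, chunk_summary(db, dic["photo_url"]) should report a single chunk for the sample image, since it is far smaller than one chunk.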

Download

import csv

from gridfs import GridFS
from pymongo import MongoClient

client = MongoClient('127.0.0.1', 27017)  # connect to MongoDB
db = client.photo  # select the database
# db.authenticate("username", "password")  # removed in PyMongo 4; pass username/password to MongoClient instead
fs = GridFS(db, collection="images")  # open the GridFS bucket named "images"
metadata_file = None
# Count from 1 so that num is the number of images written so far
for num, grid_out in enumerate(fs.find(no_cursor_timeout=True), start=1):
    data = grid_out.read()  # read the image data
    with open('/home/%d.jpg' % num, 'wb') as outf:
        outf.write(data)  # save the image
    # Start a new metadata file for every 100000 images
    if (num - 1) % 100000 == 0:
        if metadata_file:
            metadata_file.close()
        metadata_file = open("/home/metadata%d.csv" % ((num - 1) // 100000 + 1),
                             "a", newline="", encoding="utf-8")
        csv_writer = csv.writer(metadata_file, delimiter='\t')
    row = [grid_out.photo_title, grid_out.upload_date,
           grid_out.longitude, grid_out.latitude, grid_out.width, grid_out.height,
           grid_out.owner_name, grid_out.photo_id, grid_out._id, grid_out.photo_url]
    csv_writer.writerow(row)
if metadata_file:
    metadata_file.close()
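The loop above exports every stored image. When only one image is needed, a query by attribute is enough. The following is a minimal sketch under that assumption; export_image is a hypothetical helper name, and fs is expected to be a GridFS object like the one created above.

```python
def export_image(fs, photo_url, out_path):
    """Hypothetical helper: write the GridFS file matching photo_url to out_path.

    Returns True if a file was found and written, False otherwise.
    """
    grid_out = fs.find_one({"photo_url": photo_url})
    if grid_out is None:
        return False
    with open(out_path, "wb") as outf:
        outf.write(grid_out.read())  # read() returns the reassembled image bytes
    return True
```

For example, export_image(fs, dic["photo_url"], "/home/sample.jpg") would write the sample image if it has been uploaded; the output path here is only an illustration.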

2. BSON binary

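This method puts the image data into the dictionary as one more key-value pair and stores it in the database together with the attribute data as a single document. Because a BSON document cannot exceed 16 MB, it only suits images below that size.
The upload code is as follows: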

import requests
from bson import binary
from pymongo import MongoClient

client = MongoClient('127.0.0.1', 27017)  # connect to MongoDB
db = client.photo  # select the database
image_collection = db.images
data = requests.get(dic["photo_url"], timeout=10).content
# Save only after confirming the image is not already in the database
if not image_collection.find_one({"photo_url": dic["photo_url"]}):
    dic["imagecontent"] = binary.Binary(data)
    image_collection.insert_one(dic)  # insert() was removed in PyMongo 4
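The original post shows only the upload for this method. Reading the image back could look like the sketch below, which assumes the image bytes were stored under the imagecontent key as in the upload code. The helper name load_image_bytes is hypothetical, and collection is expected to be a PyMongo collection such as image_collection.

```python
def load_image_bytes(collection, photo_url):
    """Hypothetical helper: return the stored image bytes, or None if not found."""
    # The projection asks the server to return only the image field (plus _id).
    doc = collection.find_one({"photo_url": photo_url}, {"imagecontent": 1})
    if doc is None:
        return None
    return bytes(doc["imagecontent"])
```

The returned bytes can be written to a file opened in "wb" mode, in the same way as in the GridFS download code.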
