Python Notes - Working with Databases in Python: MongoDB (2)
I. NoSQL?
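1. NoSQL is a general term for non-relational databases. With the rise of Web 2.0 sites, traditional relational databases struggled to keep up, particularly with very large, highly concurrent, fully dynamic SNS-style sites, and ran into many problems that were hard to overcome. Non-relational databases, thanks to their own characteristics, developed very quickly. NoSQL databases emerged to address the challenges posed by large-scale data sets containing many kinds of data, especially the problems of big-data applications.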
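Common non-relational databases: HBase, Redis, MongoDB, Neo4j. (NewSQL, often mentioned alongside them, is a class of scalable relational systems rather than a NoSQL database.)
They contrast with the SQL databases covered in the previous article (represented by MySQL).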
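2. "non SQL" or "non relational": not SQL and not relational; data is not stored and retrieved through table relations;
"Not only SQL": grew out of the big-data and real-time applications of Web 2.0, and may support SQL-like query languages;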
3. Categories of NoSQL databases:
<1>. Column (column-oriented storage built on a key-value foundation, used for distributed data):
Accumulo, Cassandra, Druid, HBase, Vertica.
<2>. Document (data stored as documents, commonly JSON):
Apache CouchDB, ArangoDB, BaseX, Clusterpoint, Couchbase, Cosmos DB, IBM Domino, MarkLogic, MongoDB, OrientDB, Qizx, RethinkDB
<3>. Key-value (data stored as key-value pairs):
Aerospike, Apache Ignite, ArangoDB, Berkeley DB, Couchbase, Dynamo, FairCom c-treeACE, FoundationDB, InfinityDB, MemcacheDB, MUMPS, Oracle NoSQL Database, OrientDB, Redis, Riak, SciDB, SDBM/Flat File dbm, ZooKeeper
<4>. Graph (data stored as a graph structure, used for social networks, transport networks, and so on):
AllegroGraph, ArangoDB, InfiniteGraph, Apache Giraph, MarkLogic, Neo4j, OrientDB, Virtuoso
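To make the four categories more concrete, the sketch below models one small, made-up data set in each style using only plain Python containers. The names (`kv_store`, `document`, `column_store`, `graph`, `followed_by`) and the sample users are illustrative assumptions; none of this is the API of any product listed above.

```python
import json

# Key-value: an opaque value looked up by a single key.
kv_store = {"user:1": json.dumps({"name": "catt1e", "age": 20})}

# Document: a self-describing, possibly nested record (JSON-like).
document = {
    "_id": 1,
    "name": "catt1e",
    "age": 20,
    "tags": ["python", "mongodb"],
}

# Column: row key -> column family -> column -> value.
column_store = {
    "user:1": {
        "profile": {"name": "catt1e", "age": 20},
        "stats": {"logins": 3},
    }
}

# Graph: nodes plus edges describing relationships between them.
graph = {
    "nodes": {1: {"name": "catt1e"}, 2: {"name": "alice"}},
    "edges": [(1, "FOLLOWS", 2)],
}


def followed_by(graph, node_id):
    """Return the names of the nodes that node_id follows."""
    return [
        graph["nodes"][dst]["name"]
        for src, rel, dst in graph["edges"]
        if src == node_id and rel == "FOLLOWS"
    ]


if __name__ == "__main__":
    print(json.loads(kv_store["user:1"])["name"])     # key-value lookup, then decode
    print(document["tags"][0])                        # nested field access
    print(column_store["user:1"]["stats"]["logins"])  # one column of one row
    print(followed_by(graph, 1))                      # follow edges from node 1
```

The practical difference is where the structure lives: a key-value store leaves decoding to the application, a document store understands the nested fields, a column store groups values by column family, and a graph store treats relationships as first-class data.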
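II. Introduction to MongoDB:
1. Overview:
MongoDB is a database based on distributed file storage, written in C++. It aims to provide a scalable, high-performance data storage solution for web applications. Its features include: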
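• Indexes and full-text search, with convenient querying;
• TTL (time to live): documents can expire after a set time;
• Geospatial indexes;
• Replica sets: high availability, with the option of spreading read load across members (writes still go to the primary);
• Sharding for horizontal scaling;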
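The first three features in the list above (ordinary and full-text indexes, TTL, geospatial indexes) can be sketched with pymongo's `Collection.create_index`. This is a minimal sketch, not a recommended schema: the field names (`name`, `content`, `created_at`, `location`), the one-hour expiry, and the helper names `INDEX_SPECS`, `ensure_indexes`, and `sample_document` are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# (keys, options) pairs in the shape accepted by pymongo's
# Collection.create_index(keys, **options). All field names are hypothetical.
INDEX_SPECS = [
    ([("name", 1)], {}),                                  # ordinary ascending index
    ([("content", "text")], {}),                          # full-text index
    ([("created_at", 1)], {"expireAfterSeconds": 3600}),  # TTL: expire about 1 hour after created_at
    ([("location", "2dsphere")], {}),                     # geospatial index on GeoJSON data
]


def ensure_indexes(collection):
    """Create every index in INDEX_SPECS and return the resulting index names."""
    return [
        collection.create_index(keys, **options)
        for keys, options in INDEX_SPECS
    ]


def sample_document():
    """Build a document whose fields match the indexes above."""
    return {
        "name": "memo-1",
        "content": "buy milk and read the MongoDB manual",
        "created_at": datetime.now(timezone.utc),  # TTL indexes need a date field
        "location": {
            "type": "Point",
            "coordinates": [116.40, 39.90],        # GeoJSON order: [longitude, latitude]
        },
    }
```

With a local server running, something like `ensure_indexes(MongoClient()["newdb"]["articles"])` (after `from pymongo import MongoClient`) would apply them; the database and collection names there are placeholders. Replica sets and sharding are deployment-level features and are not covered by this sketch.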
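III. Installing MongoDB:
Download the Community edition from the official site, https://www.mongodb.com/download-center/community. When installing, prefer the C: drive (the default installation path is fine); otherwise the installer may report errors.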
IV. Some simple MongoDB operations:
1. A basic session in the mongo shell:
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("ae9072fd-31c4-4078-8e10-6868f14b18a5") }
MongoDB server version: 4.2.3
Server has startup warnings:
2020-03-24T16:19:10.470+0800 I CONTROL [initandlisten]
2020-03-24T16:19:10.470+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2020-03-24T16:19:10.470+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2020-03-24T16:19:10.470+0800 I CONTROL [initandlisten]
---
Enable MongoDB's free cloud-based monitoring service, which will then receive and display
metrics about your deployment (disk utilization, CPU, operation statistics, etc).

The monitoring data will be available on a MongoDB website with a unique URL accessible to you
and anyone you share the URL with. MongoDB may use this information to make product
improvements and to suggest MongoDB products and deployment options to you.

To enable free monitoring, run the following command: db.enableFreeMonitoring()
To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---

// List the databases. admin, config and local exist by default; catt1e is one I created earlier.
> show dbs
admin 0.000GB
catt1e 0.000GB
config 0.000GB
local 0.000GB

// Switch to the database test. It is not actually created until data is written to it.
> use test
switched to db test

// test is not listed yet because it holds no data.
> show dbs
admin 0.000GB
catt1e 0.000GB
config 0.000GB
local 0.000GB

// Insert a document into the collection test.
> db.test.insert({'name':'catt1e','age':'20'})
WriteResult({ "nInserted" : 1 })

// The database test now appears in the list.
> show dbs
admin 0.000GB
catt1e 0.000GB
config 0.000GB
local 0.000GB
test 0.000GB

// Drop the current database.
> db.dropDatabase()
{ "dropped" : "test", "ok" : 1 }

// List the databases again to confirm the deletion.
> show dbs
admin 0.000GB
catt1e 0.000GB
config 0.000GB
local 0.000GB
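The same create-on-first-write behaviour can be checked from Python. The following is a minimal sketch assuming pymongo and a local server on the default port; `database_lifecycle` and `main` are hypothetical helper names, and the function drops the database it is given, so it should only be pointed at a scratch name.

```python
def database_lifecycle(client, db_name="test"):
    """Mirror the shell session: look, insert, look again, drop, look again.

    Returns three booleans saying whether db_name was listed before the
    insert, after the insert, and after the drop.
    WARNING: this drops db_name, so use a scratch database only.
    """
    db = client[db_name]  # like `use test`: nothing is created yet
    before = db_name in client.list_database_names()

    # The first write is what actually creates the database.
    db["test"].insert_one({"name": "catt1e", "age": "20"})
    after_insert = db_name in client.list_database_names()

    client.drop_database(db_name)  # like db.dropDatabase()
    after_drop = db_name in client.list_database_names()
    return before, after_insert, after_drop


def main():
    # Imported here so database_lifecycle can be reused without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient("mongodb://127.0.0.1:27017")
    print(database_lifecycle(client))
```

Calling `main()` against a server that has no `test` database yet should, going by the shell session above, report the database as listed only between the insert and the drop.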
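2. Further learning resources:
https://mubu.com/doc/3lQtP4nPDv5 (partial screenshots of my personal Mubu mind map)
https://docs.mongodb.com/manual/ (official documentation)
https://mongoing.com/docs (Chinese manual; currently unavailable for unknown reasons)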
V. Working with MongoDB from Python: pymongo
pip install pymongo # install the pymongo library
Demonstrated in IPython:
In [1]: from pymongo import MongoClient  # import the client class

In [2]: client = MongoClient()  # create a client; defaults to localhost:27017

In [3]: client.list_database_names()  # list the databases on the server
Out[3]: ['admin', 'catt1e', 'config', 'local']

In [4]: db = client.newdb  # get a handle to a new database (created on first write)

In [5]: db  # inspect the database object
Out[5]: Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'newdb')

In [6]: db.create_collection('51memo')  # create the '51memo' collection
Out[6]: Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'newdb'), '51memo')

In [7]: db.list_collection_names()  # list the collections
Out[7]: ['51memo']

In [8]: collect = db['51memo']  # get the collection object

In [9]: doc = {'name':'newdb','cost_time':'30'}

In [10]: collect.insert(doc)  # insert a document (insert is deprecated and was removed in PyMongo 4; use insert_one)
A:\Anaconda\anaconda\envs\cattlelll\Scripts\ipython:1: DeprecationWarning: insert is deprecated. Use insert_one or insert_many instead.
Out[10]: ObjectId('5e7ac7de58457c42aca30478')

In [11]: collect.find()  # query; returns a cursor, not the documents themselves
Out[11]:

In [12]: docs = ({'name':'newdb1','cost_time':'40'},{'name':'newdb2','cost_time':'60'})

In [13]: collect.insert(docs)  # insert several documents
A:\Anaconda\anaconda\envs\cattlelll\Scripts\ipython:1: DeprecationWarning: insert is deprecated. Use insert_one or insert_many instead.
Out[13]: [ObjectId('5e7ac9b158457c42aca30479'), ObjectId('5e7ac9b158457c42aca3047a')]

In [15]: collect.count_documents({})  # count all documents; the '{}' filter is required
Out[15]: 3

In [16]: collect.count_documents({'name':'newdb1'})  # count the documents matching a filter
Out[16]: 1

In [19]: collect.update_one({'name':'newdb'},{'$set':{'name':'newdb3'}})  # update a document
Out[19]:

In [21]: collect.find_one({'name':'newdb'})  # check the update: the old name no longer matches (returns None)

In [22]: collect.find_one({'name':'newdb3'})
Out[22]: {'_id': ObjectId('5e7ac7de58457c42aca30478'), 'name': 'newdb3', 'cost_time': '30'}

In [23]: collect.delete_one({'name':'newdb1'})  # delete a document
Out[23]:

In [27]: [x for x in collect.find()]  # list the remaining documents
Out[27]:
[{'_id': ObjectId('5e7ac7de58457c42aca30478'), 'name': 'newdb3', 'cost_time': '30'},
 {'_id': ObjectId('5e7ac9b158457c42aca3047a'), 'name': 'newdb2', 'cost_time': '60'}]

In [28]: collect.delete_one({'name':'newdb2'})
Out[28]:

In [30]: [x for x in collect.find()]
Out[30]: [{'_id': ObjectId('5e7ac7de58457c42aca30478'), 'name': 'newdb3', 'cost_time': '30'}]
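Because the session triggers a DeprecationWarning on `insert`, here is a sketch of the same steps using the replacements the warning itself suggests, `insert_one` and `insert_many`. It is a minimal sketch under assumptions: `crud_demo` and `main` are hypothetical helper names, `scratch_memo` is a made-up collection name, and the returned summary dictionary is only a convenient way to inspect what happened.

```python
def crud_demo(collection):
    """Repeat the session's steps with the non-deprecated pymongo methods."""
    # insert_one / insert_many replace the deprecated insert().
    first = collection.insert_one({"name": "newdb", "cost_time": "30"})
    more = collection.insert_many([
        {"name": "newdb1", "cost_time": "40"},
        {"name": "newdb2", "cost_time": "60"},
    ])
    inserted_ids = [first.inserted_id, *more.inserted_ids]

    total = collection.count_documents({})  # {} matches every document

    # update_one changes only the first match; $set replaces just the named field.
    updated = collection.update_one(
        {"name": "newdb"}, {"$set": {"name": "newdb3"}}
    )

    deleted = collection.delete_one({"name": "newdb1"})

    # find() returns a cursor; iterating it fetches the documents.
    remaining = [doc["name"] for doc in collection.find()]

    return {
        "inserted": len(inserted_ids),
        "total": total,
        "modified": updated.modified_count,
        "deleted": deleted.deleted_count,
        "remaining": remaining,
    }


def main():
    # Requires `pip install pymongo` and a MongoDB server on localhost:27017.
    from pymongo import MongoClient

    collection = MongoClient()["newdb"]["scratch_memo"]  # hypothetical scratch collection
    print(crud_demo(collection))
```

Each write method returns a result object (`InsertOneResult`, `InsertManyResult`, `UpdateResult`, `DeleteResult`) rather than a bare ObjectId, so the counts of modified and deleted documents can be checked directly instead of re-querying.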