MongoDB:22-MongoDB-GridFS

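GridFS is a MongoDB specification for storing and retrieving large files such as images, audio files, and video files.
It works like a file system for storing files, but the data is kept in MongoDB collections. GridFS can store files that exceed the 16 MB BSON document size limit.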

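Reasons to use GridFS

The reasons are as follows:
  1) Storing user-generated file content
    • Most web applications let users upload files. With a relational database, these user-generated files are usually stored in the file system, separate from the database, rather than inside it.
    • This raises some problems. How do you copy a file to every server that needs it? When a file is deleted, how do you delete all of its copies? How do you keep the files secure and handle disaster recovery?
    • GridFS handles these problems well: your database backups also back up your files.
    • Thanks to MongoDB's own replication, every replica in a MongoDB cluster holds a copy of your files. Deleting a file is as simple as deleting an object from the database.
  2) Accessing portions of file content
    • When a file is uploaded to GridFS, it is split into chunks (255 KB by default in current versions, 256 KB in older ones), and each chunk is stored separately.
    • So when you need to read a range of bytes in a file, only the chunks covering that range have to be loaded into memory, not the whole file.
    • This is very useful when reading or editing selected parts of very large media files.
  3) Storing files larger than 16 MB in MongoDB
    • A single MongoDB (BSON) document is limited to 16 MB.
    • So if your file is larger than 16 MB, you should use GridFS.
  4) Overcoming file system limitations
    • If you need to store a large number of files, you have to consider the limits of the file system itself, because file systems restrict the number of files in a directory.
    • With GridFS you no longer need to worry about this.
    • GridFS together with MongoDB sharding lets your files be distributed across multiple servers without adding operational complexity.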
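
Reason 2) says that reading a byte range only requires the chunks that cover it. The following is a minimal sketch of that arithmetic in plain Python, not driver code. The function name `chunks_for_range` is hypothetical, and the chunk size of 261120 bytes is an assumption taken from the `chunkSize` value in the mongofiles example in this article.

```python
CHUNK_SIZE = 261120  # assumed chunk size in bytes (255 * 1024)


def chunks_for_range(start, end, chunk_size=CHUNK_SIZE):
    """Return the chunk numbers (the `n` field) that cover bytes [start, end)."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    first = start // chunk_size        # chunk holding the first requested byte
    last = (end - 1) // chunk_size     # chunk holding the last requested byte
    return list(range(first, last + 1))


# Bytes 1,000,000 .. 1,299,999 of a file touch only two chunks.
needed = chunks_for_range(1_000_000, 1_300_000)
print(needed)  # [3, 4]
```

A reader would then query only those `n` values for the file's `files_id` instead of fetching every chunk.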

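GridFS features

  • GridFS is used to store and retrieve files that exceed 16 MB (the BSON document size limit), such as images, audio, and video.
  • GridFS is also a way of storing files, but the files are stored in MongoDB collections.
  • GridFS is better suited to storing files larger than 16 MB.
  • GridFS splits a large file into many small chunks (file fragments), 255 KB each by default (256 KB in older versions); each chunk is stored as one MongoDB document in the chunks collection.
  • GridFS uses two collections to store a file: fs.files and fs.chunks.
    • Metadata about the file (filename, content_type, and user-defined attributes) is stored in the files collection.
    • The actual content of each file is stored as binary data in the fs.chunks collection, split into chunks of the size described above.
      • A compound index on files_id and the chunk number (n) in the chunks collection helps access these chunks quickly.
      • If you have a sharded collection, the chunks are distributed across multiple servers, which may give better performance than a file system.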
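
To make the two-collection layout concrete, here is a small sketch that builds one fs.files-style document and the matching fs.chunks-style documents in memory. It is an illustration under assumptions, not how any driver is implemented: `to_gridfs_docs` is a hypothetical helper, the `_id` is a plain string rather than an ObjectId, and nothing is written to MongoDB.

```python
import hashlib
from datetime import datetime, timezone


def to_gridfs_docs(file_id, filename, data, chunk_size=261120):
    """Build one files document and the list of chunk documents for `data`."""
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "md5": hashlib.md5(data).hexdigest(),
    }
    # One chunk document per slice; `n` is the 0-based position in the file.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


files_doc, chunk_docs = to_gridfs_docs("demo-id", "demo.bin", b"x" * 600_000)
sizes = [len(c["data"]) for c in chunk_docs]
print(files_doc["length"], len(chunk_docs), sizes)
# 600000 3 [261120, 261120, 77760]
```

Only the last chunk is shorter than `chunkSize`; every chunk points back to its file through `files_id`.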

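GridFS modules

If you want to serve files stored in MongoDB GridFS directly to a web server or the file system, you can use the following GridFS plugins:
  1) GridFS-Fuse: exposes GridFS files directly through the file system
  2) GridFS-Nginx: serves GridFS files directly from Nginx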


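Limitations of GridFS

GridFS is not perfect; it has some limitations:
  1) Working set
    Serving GridFS files alongside other database content can significantly churn MongoDB's in-memory working set. If you do not want GridFS files to affect your working set, you can store them on a separate MongoDB server.
  2) Performance
    Serving files is slower than serving local files from a web server or file system. This performance cost is the price of the management advantages.
  3) Atomic updates
    GridFS does not provide a way to update a file atomically. If you need this, you have to maintain multiple versions of the file and select the correct one.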
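
One possible reading of "maintain multiple versions and select the correct one" is to upload each revision as a new file under the same filename and let readers pick the newest by uploadDate. The sketch below shows only that selection step on in-memory dictionaries; `latest_version` is a hypothetical helper, and treating the most recent upload as the correct version is an assumption. PyMongo's gridfs module provides `get_last_version` for a similar purpose.

```python
from datetime import datetime


def latest_version(files_docs, filename):
    """Return the newest files document for `filename`, or None if absent."""
    candidates = [d for d in files_docs if d["filename"] == filename]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d["uploadDate"])


docs = [
    {"_id": 1, "filename": "report.pdf", "uploadDate": datetime(2017, 10, 30)},
    {"_id": 2, "filename": "report.pdf", "uploadDate": datetime(2017, 10, 31)},
    {"_id": 3, "filename": "other.pdf", "uploadDate": datetime(2017, 11, 1)},
]
print(latest_version(docs, "report.pdf")["_id"])  # 2
```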

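GridFS -- adding a file

Now we use the GridFS put command to store an mp3 file, calling the mongofiles.exe tool in the bin directory of the MongoDB installation.

Open a command prompt, change to the bin directory of the MongoDB installation where mongofiles.exe is located, and enter one of the following commands:

mongofiles -d mongotest -l C:\Users\Administrator\Desktop\一个人.mp3 put 一个人.mp3

or

mongofiles -d mongotest put C:\Users\Administrator\Desktop\一个人.mp3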

Use the following command to view the file's document in the database:

db.fs.files.find()

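The command returns the following document:

/* 1 */
{
    "_id" : ObjectId("59f857e041381c2e68d6f6f1"),
    "chunkSize" : 261120,
    "uploadDate" : ISODate("2017-10-31T11:00:49.059Z"),
    "length" : 10765942,
    "md5" : "59c05eac8c006386236af5b24635e6d6",
    "filename" : "一个人.mp3"
}

Metadata structure:
  • _id: the unique id of the file, stored in each chunk as the files_id key
  • length: the total number of bytes in the file content
  • chunkSize: the size of each chunk in bytes; the default is 255 KB (261120 bytes, as shown above; 256 KB in older versions) and can be adjusted if necessary
  • uploadDate: the timestamp at which the file was stored in GridFS
  • md5: the MD5 checksum of the file content, generated on the server side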

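All chunks are stored in the fs.chunks collection. Using the file's _id obtained above, we can fetch the chunk data for that file:

db.fs.chunks.find({files_id:ObjectId('59f857e041381c2e68d6f6f1')})

In this example the query returns 42 documents, which means the mp3 file is stored in 42 chunks.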
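
The count of 42 can be checked from the length and chunkSize fields of the fs.files document above. This is just the arithmetic in Python; the variable names are illustrative.

```python
length = 10765942     # "length" from the fs.files document
chunk_size = 261120   # "chunkSize" from the same document (255 * 1024)

# Ceiling division: a final partial chunk still needs its own document.
chunk_count = (length + chunk_size - 1) // chunk_size
last_chunk_bytes = length - (chunk_count - 1) * chunk_size

print(chunk_count, last_chunk_bytes)  # 42 60022
```

Forty-one chunks are full and the last one holds the remaining 60022 bytes.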

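The documents in the chunks collection have the following structure:
  • _id: the unique ID of the chunk
  • files_id: the id of the files document that holds this chunk's metadata
  • n: the chunk number, i.e. the position of this chunk within the original file (starting at 0)
  • data: the binary data that makes up the chunk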
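
The fields above are enough to rebuild a file: collect the chunks with a matching files_id, order them by n, and join their data. The sketch below does this on in-memory dictionaries and checks the result against the length and md5 of the files document. `reassemble` is a hypothetical helper, the ids are plain strings, and the tiny chunk size is for illustration only.

```python
import hashlib
import random


def reassemble(files_doc, chunk_docs):
    """Join one file's chunks in `n` order and verify length and md5."""
    own = [c for c in chunk_docs if c["files_id"] == files_doc["_id"]]
    own.sort(key=lambda c: c["n"])
    if [c["n"] for c in own] != list(range(len(own))):
        raise ValueError("missing or duplicate chunk")
    data = b"".join(c["data"] for c in own)
    if len(data) != files_doc["length"]:
        raise ValueError("length mismatch")
    if hashlib.md5(data).hexdigest() != files_doc["md5"]:
        raise ValueError("md5 mismatch")
    return data


payload = bytes(range(256)) * 10   # 2560 bytes of sample data
size = 1000                        # deliberately tiny chunk size
files_doc = {"_id": "f1", "length": len(payload), "chunkSize": size,
             "md5": hashlib.md5(payload).hexdigest()}
chunk_docs = [{"files_id": "f1", "n": n, "data": payload[i:i + size]}
              for n, i in enumerate(range(0, len(payload), size))]
random.shuffle(chunk_docs)         # do not rely on storage order
restored = reassemble(files_doc, chunk_docs)
print(restored == payload)  # True
```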


Appendix -- mongofiles command usage

All mongofiles commands have the following form:

mongofiles <options> <commands> <filename>

The components of the mongofiles command are:

  1. Options. You may use one or more of these options to control the behavior of mongofiles.
  2. Commands. Use one of these commands to determine the action of mongofiles.
  3. A filename, which is either the name of a file on your local file system or a GridFS object.

Run mongofiles from the system command line, not the mongo shell.

IMPORTANT

For replica sets, mongofiles can only read from the set’s primary.

Required Access

In order to connect to a mongod that enforces authorization with the --auth option, you must use the --username and --password options. 

The connecting user must possess, at a minimum:

  • the read role for the accessed database when using the list, search, or get commands,
  • the readWrite role for the accessed database when using the put or delete commands.

Options

Changed in version 3.0.0: mongofiles removed the --dbpath as well as related --directoryperdb and --journal options. 

To use mongofiles, you must run mongofiles against a running mongod or mongos instance as appropriate.

mongofiles
--help

Returns information on the options and use of mongofiles.

--verbose, -v

Increases the amount of internal reporting returned on standard output or in log files.

Increase the verbosity with the -v form by including the option multiple times (e.g. -vvvvv).

--quiet

Runs mongofiles in a quiet mode that attempts to limit the amount of output.

This option suppresses:

  • output from database commands
  • replication activity
  • connection accepted events
  • connection closed events

--version

Returns the mongofiles release number.

--host <hostname><:port>

Specifies a resolvable hostname for the mongod that holds your GridFS system. By default, mongofiles attempts to connect to a MongoDB process running on localhost on port 27017.

Optionally, specify a port number to connect to a MongoDB instance running on a port other than 27017.

--port <port>

Default: 27017

Specifies the TCP port on which the MongoDB instance listens for client connections.

--ipv6

Removed in version 3.0.

Enables IPv6 support and allows mongofiles to connect to the MongoDB instance using an IPv6 network. Prior to MongoDB 3.0, you had to specify --ipv6 to use IPv6. In MongoDB 3.0 and later, IPv6 is always enabled.

--ssl

New in version 2.6.

Enables connection to a mongod or mongos that has TLS/SSL support enabled.

Changed in version 3.0: Most MongoDB distributions include support for TLS/SSL. See Configure mongod and mongos for TLS/SSL and TLS/SSL Configuration for Clients for more information about TLS/SSL and MongoDB.

Changed in version 3.4: If --sslCAFile is not specified when connecting to an TLS/SSL-enabled server, the system-wide CA certificate store will be used.

--sslCAFile <filename>

New in version 2.6.

Specifies the .pem file that contains the root certificate chain from the Certificate Authority. Specify the file name of the .pem file using relative or absolute paths.

WARNING

Version 3.2 and earlier: For SSL connections (--ssl) to mongod and mongos, if mongofiles runs without the --sslCAFile option, mongofiles will not attempt to validate the server certificates. This creates a vulnerability to expired mongod and mongos certificates as well as to foreign processes posing as valid mongod or mongos instances. Ensure that you always specify the CA file to validate the server certificates in cases where intrusion is a possibility.

--sslPEMKeyFile <filename>

New in version 2.6.

Specifies the .pem file that contains both the TLS/SSL certificate and key. Specify the file name of the .pem file using relative or absolute paths.

This option is required when using the --ssl option to connect to a mongod or mongos that has CAFile enabled without allowConnectionsWithoutCertificates.

--sslPEMKeyPassword <value>

New in version 2.6.

Specifies the password to decrypt the certificate-key file (i.e. --sslPEMKeyFile). Use the --sslPEMKeyPassword option only if the certificate-key file is encrypted. In all cases, mongofiles will redact the password from all logging and reporting output.

If the private key in the PEM file is encrypted and you do not specify the --sslPEMKeyPassword option, mongofiles will prompt for a passphrase. See SSL Certificate Passphrase.

--sslCRLFile <filename>

New in version 2.6.

Specifies the .pem file that contains the Certificate Revocation List. Specify the file name of the .pem file using relative or absolute paths.

--sslAllowInvalidCertificates

New in version 2.6.

Bypasses the validation checks for server certificates and allows the use of invalid certificates. When using the allowInvalidCertificates setting, MongoDB logs as a warning the use of the invalid certificate.

--sslAllowInvalidHostnames

New in version 3.0.

Disables the validation of the hostnames in TLS/SSL certificates. Allows mongofiles to connect to MongoDB instances even if the hostname in their certificates does not match the specified hostname.

--sslFIPSMode

New in version 2.6.

Directs mongofiles to use the FIPS mode of the installed OpenSSL library. Your system must have a FIPS-compliant OpenSSL library to use the --sslFIPSMode option.

NOTE

FIPS-compatible SSL is available only in MongoDB Enterprise. See Configure MongoDB for FIPS for more information.

--username <username>, -u <username>

Specifies a username with which to authenticate to a MongoDB database that uses authentication. Use in conjunction with the --password and --authenticationDatabase options.

--password <password>, -p <password>

Specifies a password with which to authenticate to a MongoDB database that uses authentication. Use in conjunction with the --username and --authenticationDatabase options.

Changed in version 3.0.0: If you do not specify an argument for --password, mongofiles returns an error.

Changed in version 3.0.2: If you wish mongofiles to prompt the user for the password, pass the --username option without --password or specify an empty string as the --password value, as in --password "".

--authenticationDatabase <dbname>

Specifies the database in which the user is created. See Authentication Database.

--authenticationMechanism <name>

Default: SCRAM-SHA-1

Changed in version 2.6: Added support for the PLAIN and MONGODB-X509 authentication mechanisms.

Changed in version 3.0: Added support for the SCRAM-SHA-1 authentication mechanism. Changed default mechanism to SCRAM-SHA-1.

Specifies the authentication mechanism the mongofiles instance uses to authenticate to the mongod or mongos.

Values and their descriptions:

  • SCRAM-SHA-1: RFC 5802 standard Salted Challenge Response Authentication Mechanism using the SHA1 hash function.
  • MONGODB-CR: MongoDB challenge/response authentication.
  • MONGODB-X509: MongoDB TLS/SSL certificate authentication.
  • GSSAPI (Kerberos): External authentication using Kerberos. This mechanism is available only in MongoDB Enterprise.
  • PLAIN (LDAP SASL): External authentication using LDAP. You can also use PLAIN for authenticating in-database users. PLAIN transmits passwords in plain text. This mechanism is available only in MongoDB Enterprise.

--gssapiServiceName

New in version 2.6.

Specify the name of the service using GSSAPI/Kerberos. Only required if the service does not use the default name of mongodb.

This option is available only in MongoDB Enterprise.

--gssapiHostName

New in version 2.6.

Specify the hostname of a service using GSSAPI/Kerberos. Only required if the hostname of a machine does not match the hostname resolved by DNS.

This option is available only in MongoDB Enterprise.

--db <database>, -d <database>

Specifies the name of the database on which to run mongofiles.

--collection <collection>, -c <collection>

This option has no use in this context and a future release may remove it. See SERVER-4931 for more information.

--local <filename>, -l <filename>

Specifies the local filesystem name of a file for get and put operations.

In the mongofiles put and mongofiles get commands, the required <filename> modifier refers to the name the object will have in GridFS. mongofiles assumes that this reflects the file’s name on the local file system. This setting overrides this default.

--type <MIME>

Provides the ability to specify a MIME type to describe the file inserted into GridFS storage. mongofiles omits this option in the default operation.

Use only with mongofiles put operations.

--replace, -r

Alters the behavior of mongofiles put to replace existing GridFS objects with the specified local file, rather than adding an additional object with the same name.

In the default operation, files will not be overwritten by a mongofiles put operation.

--prefix string

Default: fs

GridFS prefix to use.

--writeConcern <document>

Default: majority

Specifies the write concern for each write operation that mongofiles writes to the target database.

Specify the write concern as a document with w options.

Commands

list <prefix>

Lists the files in the GridFS store. The characters specified after list (e.g. <prefix>) optionally limit the list of returned items to files that begin with that string of characters.

search <string>

Lists the files in the GridFS store with names that match any portion of <string>.

put <filename>

Copy the specified file from the local file system into GridFS storage.

Here, <filename> refers to the name the object will have in GridFS, and mongofiles assumes that this reflects the name the file has on the local file system. If the local filename is different, use the mongofiles --local option.

get <filename>

Copy the specified file from GridFS storage to the local file system.

Here, <filename> refers to the name of the object in GridFS. mongofiles writes the file to the local file system using the file’s filename in GridFS. To choose a different location for the file on the local file system, use the --local option.

get_id "<_id>"

New in version 3.2.0.

Copy the specified file from GridFS storage to the local file system.

Here, <_id> refers to the extended JSON _id of the object in GridFS. mongofiles writes the file to the local file system using the file’s filename in GridFS. To choose a different location for the file on the local file system, use the --local option.

delete <filename>

Delete the specified file from GridFS storage.

delete_id "<_id>"

New in version 3.2.0.

Delete the specified file from GridFS storage. Specify the file using its _id.

Examples

To return a list of all files in a GridFS collection in the records database, use the following invocation at the system shell:

mongofiles -d records list

This mongofiles instance will connect to the mongod instance running on port 27017 of the localhost interface. To specify the same operation on a different port or hostname, issue a command that resembles one of the following:

mongofiles --port 37017 -d records list
mongofiles --host db1.example.net -d records list
mongofiles --host db1.example.net --port 37017 -d records list

Modify any of the following commands as needed if you’re connecting to mongod instances on different ports or hosts.

To upload a file named 32-corinth.lp to the GridFS collection in the records database, you can use the following command:

mongofiles -d records put 32-corinth.lp

To delete the 32-corinth.lp file from this GridFS collection in the records database, you can use the following command:

mongofiles -d records delete 32-corinth.lp

To search for files in the GridFS collection in the records database that have the string corinth in their names, you can use the following command:

mongofiles -d records search corinth

To list all files in the GridFS collection in the records database that begin with the string 32, you can use the following command:

mongofiles -d records list 32

To fetch the file from the GridFS collection in the records database named 32-corinth.lp, you can use the following command:

mongofiles -d records get 32-corinth.lp

To fetch the file from the GridFS collection in the records database with _id:ObjectId("56feac751f417d0357e7140f"), you can use the following command:

mongofiles -d records get_id 'ObjectId("56feac751f417d0357e7140f")'
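
When these invocations are scripted, it can help to assemble the argument list in one place. The following sketch only builds the list and prints it; it does not run mongofiles. `mongofiles_argv` is a hypothetical helper, and it covers only the --host, --port, -d, and --local options described above.

```python
import shlex


def mongofiles_argv(command, *args, db, host=None, port=None, local=None):
    """Build an argument list for the mongofiles CLI (does not execute it)."""
    argv = ["mongofiles"]
    if host is not None:
        argv += ["--host", host]
    if port is not None:
        argv += ["--port", str(port)]
    argv += ["-d", db]
    if local is not None:
        argv += ["--local", local]
    argv.append(command)
    argv.extend(args)   # e.g. a filename, a prefix, or a search string
    return argv


argv = mongofiles_argv("list", "32", db="records",
                       host="db1.example.net", port=37017)
print(shlex.join(argv))
# mongofiles --host db1.example.net --port 37017 -d records list 32
```

Passing the list to `subprocess.run(argv)` would run the command without going through a shell, which avoids quoting problems with unusual filenames.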

Reference: https://docs.mongodb.com/manual/reference/program/mongofiles/
Reference: http://www.runoob.com/mongodb/mongodb-gridfs.html
