Software | Version | Archive name |
---|---|---|
seaweedfs | seaweedfs-1.11 | linux_amd64.tar.gz |
https://github.com/chrislusf/seaweedfs
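Term | Description |
---|---|
master | Serves the volume => location mapping and issues the sequence numbers used for file ids |
Node | An abstract node in the system topology, specialized as DataCenter and Rack |
Datanode | Storage node; manages and stores logical volumes |
DataCenter | Data center; corresponds to a physical data center (machine room) and contains racks |
Rack | Rack; corresponds to a physical cabinet. A rack belongs to one specific data center, and a data center can contain multiple racks. |
Volume | Logical volume, the logical storage structure; needles are stored inside volumes. A VolumeServer contains one Store |
Needle | An object inside a logical volume, corresponding to a stored file. Needle file size is limited to 4GB for now. |
Collection | A set of files that can span multiple logical volumes. If no collection is specified when a file is stored, the default collection "" (empty name) is used |
Filer | File manager. The filer uploads data to the Weed Volume Servers, splits large files into chunks, and writes the metadata and chunk information to the filer store. |
Mount | Userspace mount. When the filer is used together with mount, the filer only serves file metadata lookups; file content is read and written directly between the mount and the volume servers, so multiple filers are not needed |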
Run $ ./weed -h to list the commands and their descriptions
Run $ ./weed [command] -h to list a command's parameters and their descriptions
Node | master | volume | filer |
---|---|---|---|
cdh1 | √ | √ | √ |
cdh2 | √ | √ | √ |
cdh3 | √ | √ | √ |
$ tar -zxvf ./linux_amd64.tar.gz
This produces the weed binary.
Create the directories:
$ mkdir seaweedfd_master
$ mkdir seaweedfd_data
Start the masters (one command per node):
$ ./weed master -ip cdh1 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001
$ ./weed master -ip cdh2 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001
$ ./weed master -ip cdh3 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001
To avoid split-brain, only an odd number of masters is supported.
Run in the background: $ nohup ./weed master -ip cdh3 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001 > weed_master.out &
At least two of the three masters must stay alive for the cluster to serve requests.
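The two-out-of-three requirement is what majority voting among the masters implies. A minimal sketch of that arithmetic, assuming a plain strict-majority rule (the function names are illustrative and not part of SeaweedFS):

```python
def quorum(master_count: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    if master_count < 1:
        raise ValueError("need at least one master")
    return master_count // 2 + 1


def tolerated_failures(master_count: int) -> int:
    """How many masters may be down while a majority is still alive."""
    return master_count - quorum(master_count)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"masters={n} quorum={quorum(n)} tolerated={tolerated_failures(n)}")
```

Under this rule four masters tolerate no more failures than three, which is one way to see why an even master count buys nothing.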
Start the volume servers:
$ ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh1 -ip.bind cdh1 -maxCpu 1 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh1 -rack rack1
$ ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh2 -ip.bind cdh2 -maxCpu 1 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh2 -rack rack1
$ ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh3 -ip.bind cdh3 -maxCpu 1 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh3 -rack rack1
dataCenter: name of the data center
rack: name of the rack
Start in the background: $ nohup ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh1 -ip.bind cdh1 -maxCpu 1 -max 200 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh1 -rack rack1 > weed_volume.out &
Open the master web UI:
http://cdh3:9333/
Upload a directory from the command line:
$ ./weed upload -dataCenter dc1 -master=cdh3:9333 -dir="./dir/"
Assign a file key:
# Basic usage:
$ curl http://cdh1:9333/dir/assign
# Specify the replication type:
$ curl "http://cdh1:9333/dir/assign?replication=001"
# Specify how many file ids to reserve
$ curl "http://cdh1:9333/dir/assign?count=5"
# Specify the data center
$ curl "http://cdh1:9333/dir/assign?dataCenter=dc1"
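The replication value used above (and in -defaultReplication 001) is a three-digit code. As documented by SeaweedFS, the digits count the extra copies placed in other data centers, on other racks in the same data center, and on other servers in the same rack. The following sketch decodes such a code; the class and function names are hypothetical helpers, not a SeaweedFS API.

```python
from typing import NamedTuple


class Replication(NamedTuple):
    other_data_centers: int  # first digit: copies in other data centers
    other_racks: int         # second digit: copies on other racks, same data center
    same_rack_servers: int   # third digit: copies on other servers, same rack

    @property
    def total_copies(self) -> int:
        """The original plus every extra copy."""
        return 1 + sum(self)


def parse_replication(code: str) -> Replication:
    """Decode a three-digit replication string such as "001"."""
    if len(code) != 3 or any(c not in "0123456789" for c in code):
        raise ValueError(f"expected three digits, got {code!r}")
    return Replication(*(int(c) for c in code))


if __name__ == "__main__":
    r = parse_replication("001")
    print(r, "total copies:", r.total_copies)
```

With 001 and all three volume servers on rack1, each file ends up on two servers of the same rack.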
Example: uploading a file:
# Get a file key
$ curl "http://cdh1:9333/dir/assign?dataCenter=dc1"
# JSON response
{"fid":"2,016beb339d","url":"cdh2:9222","publicUrl":"cdh2","count":1}
# Upload a file to the assigned fid
$ curl -F file=@./file http://cdh2:9222/2,016beb339d
# JSON response
{"name":"file","size":41629428}
Retrieve the file:
$ curl http://cdh2:9222/2,016beb339d
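The upload and download URLs above are both built from the assign response: the url field names the volume server and fid names the file. A small sketch of that step, working only on the JSON text so it runs without a cluster (file_url is an illustrative helper name):

```python
import json


def file_url(assign_response: str) -> str:
    """Build the volume-server URL for the fid returned by /dir/assign."""
    reply = json.loads(assign_response)
    if "fid" not in reply or "url" not in reply:
        # Assumption: a reply without these fields means the assignment failed.
        raise ValueError(f"unexpected assign response: {reply}")
    return f"http://{reply['url']}/{reply['fid']}"


if __name__ == "__main__":
    sample = '{"fid":"2,016beb339d","url":"cdh2:9222","publicUrl":"cdh2","count":1}'
    print(file_url(sample))
```

The same URL serves the POST that uploads the file and the GET that reads it back.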
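Configure and start the filer:
# Print the filer.toml configuration template
$ ./weed scaffold filer
By default the filer stores file metadata in leveldb.
# Generate the configuration file
$ ./weed scaffold -config filer -output="."
# Example: use postgres as the metadata store
# Create the table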
=========================================
CREATE TABLE IF NOT EXISTS filemeta (
dirhash BIGINT,
name VARCHAR(1000),
directory VARCHAR(4096),
meta bytea,
PRIMARY KEY (dirhash, name)
);
=========================================
# Edit the [postgres] section in filer.toml
$ vi filer.toml
Start:
$ ./weed filer -master cdh1:9333,cdh2:9333,cdh3:9333 -port 8888 -port.public 8889
Start in the background: $ nohup ./weed filer -master cdh1:9333,cdh2:9333,cdh3:9333 -port 8888 > weed_filer.out &
Running several filers is recommended; they all share the same database.
Upload a file:
$ curl -F "file=@./file" "http://cdh1:8888/path/to/sources/"
Open the filer web UI:
http://cdh1:8888/
# Download the latest seaweedfs-hadoop-client from Maven Central
https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-hadoop-client
# Make sure a mapred-site.xml file exists
# Test ls
==================================================================================
../../bin/hdfs dfs -Dfs.defaultFS=seaweedfs://cdh1:8888 \
-Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
-libjars ./seaweedfs-hadoop-client-1.0.2.jar \
-ls /
# Output
Found 2 items
drwxrwx--- - 0 2018-12-13 10:29 /path
drwxrwx--- - 0 2018-12-13 14:17 /weed
# Test uploading a file
==================================================================================
../../bin/hdfs dfs -Dfs.defaultFS=seaweedfs://cdh1:8888 \
-Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
-libjars ./seaweedfs-hadoop-client-1.0.2.jar \
-put ./slaves /
# Test downloading a directory
==================================================================================
../../bin/hdfs dfs -Dfs.defaultFS=seaweedfs://cdh1:8888 \
-Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
-libjars ./seaweedfs-hadoop-client-1.0.2.jar \
-get /path
Configure Hadoop:
$ vi core-site.xml
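<property>
  <name>fs.seaweedfs.impl</name>
  <value>seaweed.hdfs.SeaweedFileSystem</value>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>seaweedfs://cdh1:8888</value>
</property>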
# Install the SeaweedFS HDFS client jar on every node (check the classpath first)
$ bin/hadoop classpath
$ cp ./seaweedfs-hadoop-client-1.0.2.jar /hadoop/share/hadoop/common/lib/
$ scp ./seaweedfs-hadoop-client-1.0.2.jar cdh2:/hadoop/share/hadoop/common/lib
$ scp ./seaweedfs-hadoop-client-1.0.2.jar cdh3:/hadoop/share/hadoop/common/lib
$ scp ./core-site.xml cdh2:/hadoop/etc/hadoop/
$ scp ./core-site.xml cdh3:/hadoop/etc/hadoop/
# Verify
$ ../../bin/hdfs dfs -ls seaweedfs://cdh3:8888/
# Output
Found 3 items
drwxrwx--- - 0 2018-12-13 10:29 seaweedfs://cdh3:8888/path
-rw-r--r-- 1 dpnice dpnice 15 2018-12-13 14:41 seaweedfs://cdh3:8888/slaves
drwxrwx--- - 0 2018-12-13 14:17 seaweedfs://cdh3:8888/weed
Assign a file key:
# Basic Usage:
curl http://localhost:9333/dir/assign
# To assign with a specific replication type:
curl "http://localhost:9333/dir/assign?replication=001"
# To specify how many file ids to reserve
curl "http://localhost:9333/dir/assign?count=5"
# To assign a specific data center
curl "http://localhost:9333/dir/assign?dataCenter=dc1"
Look up the location of a volume:
curl "http://localhost:9333/dir/lookup?volumeId=3&pretty=y"
{
"locations": [
{
"publicUrl": "localhost:8080",
"url": "localhost:8080"
}
]
}
# Other usages:
# You can also pass the whole file id if you would rather not parse it.
curl "http://localhost:9333/dir/lookup?volumeId=3,01637037d6"
# If you know the collection, specify it since it will be a little faster
curl "http://localhost:9333/dir/lookup?volumeId=3&collection=turbo"
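For readers who do want to parse the file id: in a fid such as 3,01637037d6, the number before the comma is the volume id, the last eight hex digits are the cookie, and the hex digits in between are the file key. The sketch below splits a fid that way; FileId and parse_fid are illustrative names, not part of any SeaweedFS client.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    volume_id: int
    file_key: int
    cookie: int


def parse_fid(fid: str) -> FileId:
    """Split "<volume>,<key hex><8 hex digit cookie>" into its parts."""
    volume_part, comma, rest = fid.partition(",")
    if not comma or len(rest) <= 8:
        raise ValueError(f"malformed file id: {fid!r}")
    return FileId(
        volume_id=int(volume_part),
        file_key=int(rest[:-8], 16),
        cookie=int(rest[-8:], 16),
    )


if __name__ == "__main__":
    fid = parse_fid("3,01637037d6")
    print(fid.volume_id, hex(fid.file_key), hex(fid.cookie))
```

Only volume_id is needed for /dir/lookup; the key and cookie matter to the volume server that stores the needle.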
Garbage collection (vacuum):
curl "http://localhost:9333/vol/vacuum"
curl "http://localhost:9333/vol/vacuum?garbageThreshold=0.4"
Vacuuming creates copies of the .dat and .idx files that skip the deleted files, then keeps the copies and removes the originals.
garbageThreshold is optional.
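One way to read garbageThreshold is as a ratio: a volume qualifies for vacuuming once the share of its bytes occupied by deleted files exceeds the threshold. The sketch below encodes that reading. Both the formula and the function names are assumptions made for illustration, not the server's actual implementation.

```python
def garbage_ratio(size: int, deleted_bytes: int) -> float:
    """Fraction of a volume's bytes taken up by deleted files (assumed formula)."""
    return deleted_bytes / size if size > 0 else 0.0


def needs_vacuum(size: int, deleted_bytes: int, threshold: float = 0.4) -> bool:
    """True when the garbage ratio exceeds the given threshold."""
    return garbage_ratio(size, deleted_bytes) > threshold


if __name__ == "__main__":
    # Hypothetical volumes: (size in bytes, bytes held by deleted files).
    for size, deleted in [(1000, 500), (1000, 100), (0, 0)]:
        print(size, deleted, needs_vacuum(size, deleted))
```

The default of 0.4 here simply mirrors the value passed in the curl example above.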
Pre-allocate volumes:
# specify a specific replication
curl "http://localhost:9333/vol/grow?replication=000&count=4"
{"count":4}
# specify a collection
curl "http://localhost:9333/vol/grow?collection=turbo&count=4"
# specify data center
curl "http://localhost:9333/vol/grow?dataCenter=dc1&count=4"
# specify ttl
curl "http://localhost:9333/vol/grow?ttl=5d&count=4"
count is the number of empty volumes to create.
Delete a collection:
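The /vol/grow variants above differ only in their query parameters, so a script can assemble them from one helper. A minimal sketch using the standard library; grow_url is a hypothetical name, and it only builds the URL without sending a request.

```python
from urllib.parse import urlencode


def grow_url(master: str, count: int, **options: str) -> str:
    """Build a /vol/grow URL, e.g. with replication, collection, dataCenter or ttl."""
    params = dict(options)   # keeps the keyword order given by the caller
    params["count"] = count  # count always goes last, as in the examples above
    return f"http://{master}/vol/grow?{urlencode(params)}"


if __name__ == "__main__":
    print(grow_url("localhost:9333", 4, replication="000"))
    print(grow_url("localhost:9333", 4, ttl="5d", dataCenter="dc1"))
```

Using urlencode rather than string concatenation keeps collection names with special characters safely escaped.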
# delete a collection
curl "http://localhost:9333/col/delete?collection=benchmark&pretty=y"
Check system status:
# Cluster status
curl "http://10.0.2.15:9333/cluster/status?pretty=y"
{
"IsLeader": true,
"Leader": "10.0.2.15:9333",
"Peers": [
"10.0.2.15:9334",
"10.0.2.15:9335"
]
}
# Topology status
curl "http://localhost:9333/dir/status?pretty=y"
{
"Topology": {
"DataCenters": [
{
"Free": 567,
"Id": "dc1",
"Max": 600,
"Racks": [
{
"DataNodes": [
{
"Free": 190,
"Max": 200,
"PublicUrl": "cdh2",
"Url": "cdh2:9222",
"Volumes": 10
},
{
"Free": 190,
"Max": 200,
"PublicUrl": "cdh1",
"Url": "cdh1:9222",
"Volumes": 10
},
{
"Free": 187,
"Max": 200,
"PublicUrl": "cdh3",
"Url": "cdh3:9222",
"Volumes": 13
}
],
"Free": 567,
"Id": "rack1",
"Max": 600
}
]
}
],
"Free": 567,
"Max": 600,
"layouts": [
{
"collection": "",
"replication": "001",
"ttl": "5d",
"writables": [
15,
16,
17,
18
]
},
{
"collection": "",
"replication": "000",
"ttl": "",
"writables": [
13,
14,
10,
11,
12,
19,
20,
21,
22
]
},
{
"collection": "",
"replication": "001",
"ttl": "",
"writables": [
6,
3,
7,
2,
4,
5
]
},
{
"collection": "turbo",
"replication": "001",
"ttl": "",
"writables": [
8,
9
]
}
]
},
"Version": "1.11"
}
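The topology output is easier to monitor once reduced to per-server numbers. The sketch below walks the DataCenters, Racks and DataNodes levels of a /dir/status response and reports how many volume slots each server uses. SAMPLE is a trimmed copy of the output above and node_usage is an illustrative helper name.

```python
import json

# Trimmed copy of the /dir/status output shown above.
SAMPLE = """
{"Topology": {"DataCenters": [{"Id": "dc1", "Racks": [{"Id": "rack1", "DataNodes": [
  {"Free": 190, "Max": 200, "Url": "cdh2:9222", "Volumes": 10},
  {"Free": 187, "Max": 200, "Url": "cdh3:9222", "Volumes": 13}
]}]}]}}
"""


def node_usage(status_json: str) -> dict:
    """Map each volume server URL to (volumes in use, maximum volumes)."""
    topology = json.loads(status_json)["Topology"]
    usage = {}
    for dc in topology.get("DataCenters") or []:
        for rack in dc.get("Racks") or []:
            for node in rack.get("DataNodes") or []:
                usage[node["Url"]] = (node["Volumes"], node["Max"])
    return usage


if __name__ == "__main__":
    for url, (used, maximum) in sorted(node_usage(SAMPLE).items()):
        print(f"{url}: {used}/{maximum} volumes")
```

In the output above, Free equals Max minus Volumes on every node (for example 200 - 13 = 187 on cdh3), so either pair of fields is enough for a capacity check.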
Volume Server API:
# Upload a file
curl -F file=@/home/chris/myphoto.jpg http://127.0.0.1:8080/3,01637037d6
The file key must first be assigned by the master.
# Upload directly and let the master assign the key (note: this uses the master port)
curl -F file=@/home/chris/myphoto.jpg http://localhost:9333/submit
{"fid":"3,01fbe0dc6f1f38","fileName":"myphoto.jpg","fileUrl":"localhost:8080/3,01fbe0dc6f1f38","size":68231}
# Delete a file
curl -X DELETE http://127.0.0.1:8080/3,01637037d6
# View the chunk manifest of a large chunked file
curl "http://127.0.0.1:8080/3,01637037d6?cm=false"
# Check the volume server status
curl "http://localhost:8080/status?pretty=y"
{
"Version": "0.34",
"Volumes": [
{
"Id": 1,
"Size": 1319688,
"RepType": "000",
"Version": 2,
"FileCount": 276,
"DeleteCount": 0,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 2,
"Size": 1040962,
"RepType": "000",
"Version": 2,
"FileCount": 291,
"DeleteCount": 0,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 3,
"Size": 1486334,
"RepType": "000",
"Version": 2,
"FileCount": 301,
"DeleteCount": 2,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 4,
"Size": 8953592,
"RepType": "000",
"Version": 2,
"FileCount": 320,
"DeleteCount": 2,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 5,
"Size": 70815851,
"RepType": "000",
"Version": 2,
"FileCount": 309,
"DeleteCount": 1,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 6,
"Size": 1483131,
"RepType": "000",
"Version": 2,
"FileCount": 301,
"DeleteCount": 1,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 7,
"Size": 46797832,
"RepType": "000",
"Version": 2,
"FileCount": 292,
"DeleteCount": 0,
"DeletedByteCount": 0,
"ReadOnly": false
}
]
}
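A /status response like the one above can be condensed into a few totals for a quick health check. This sketch sums the per-volume fields; SAMPLE contains two of the volumes listed above and summarize is an illustrative helper name.

```python
import json

# Two of the volumes from the /status output shown above.
SAMPLE = """
{"Version": "0.34", "Volumes": [
  {"Id": 1, "Size": 1319688, "FileCount": 276, "DeleteCount": 0, "ReadOnly": false},
  {"Id": 3, "Size": 1486334, "FileCount": 301, "DeleteCount": 2, "ReadOnly": false}
]}
"""


def summarize(status_json: str) -> dict:
    """Aggregate volume count, file count, deletions, bytes and read-only ids."""
    volumes = json.loads(status_json).get("Volumes") or []
    return {
        "volumes": len(volumes),
        "files": sum(v["FileCount"] for v in volumes),
        "deleted": sum(v["DeleteCount"] for v in volumes),
        "bytes": sum(v["Size"] for v in volumes),
        "read_only": [v["Id"] for v in volumes if v["ReadOnly"]],
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```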
Filer Server API:
# Basic Usage:
# create or overwrite the file; the directory /javascript is created automatically
curl -F file=@report.js "http://localhost:8888/javascript/"
{"name":"report.js","size":866,"fid":"7,0254f1f3fd","url":"http://localhost:8081/7,0254f1f3fd"}
# get the file content
curl "http://localhost:8888/javascript/report.js"
# upload the file with a different name
curl -F file=@report.js "http://localhost:8888/javascript/new_name.js"
{"name":"report.js","size":866,"fid":"3,034389657e","url":"http://localhost:8081/3,034389657e"}
# list all files under /javascript/
curl -H "Accept: application/json" "http://localhost:8888/javascript/?pretty=y"
{
"Directory": "/javascript/",
"Files": [
{
"name": "new_name.js",
"fid": "3,034389657e"
},
{
"name": "report.js",
"fid": "7,0254f1f3fd"
}
],
"Subdirectories": null
}
# List files with pagination
curl "http://localhost:8888/javascript/?pretty=y&lastFileName=new_name.js&limit=2"
{
"Directory": "/javascript/",
"Files": [
{
"name": "report.js",
"fid": "7,0254f1f3fd"
}
]
}
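Paging through a directory means repeating the request with lastFileName set to the last name of the previous page. The sketch below shows that loop in isolation. fetch_page is a hypothetical callback standing in for the HTTP GET with lastFileName and limit; the demo replaces it with an in-memory list so the example runs without a filer.

```python
from typing import Callable, Iterator, List


def list_all(fetch_page: Callable[[str, int], List[str]], limit: int = 2) -> Iterator[str]:
    """Yield every file name, requesting one page of `limit` names at a time."""
    last = ""  # an empty lastFileName starts from the beginning
    while True:
        names = fetch_page(last, limit)
        if not names:
            return
        yield from names
        if len(names) < limit:  # a short page means nothing is left
            return
        last = names[-1]


if __name__ == "__main__":
    files = ["new_name.js", "report.js", "zzz.js"]

    def fake_fetch(last: str, limit: int) -> List[str]:
        # Stand-in for the filer: names sorted after `last`, at most `limit`.
        return [name for name in files if name > last][:limit]

    print(list(list_all(fake_fetch, limit=2)))
```

Stopping on a short page saves one request; the empty-page check still covers directories whose size is an exact multiple of limit.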
# Delete a file
curl -X DELETE "http://localhost:8888/javascript/report.js"