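Mesos provides a Scheduler HTTP RESTful API. Both software developers and software testers can use these APIs to learn more about how Mesos works. This article describes how to access these APIs with curl.
The test environment has 1 Mesos master. The master's command options are set as follows: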
MESOS_MASTER_OPTS="--work_dir=/var/lib/mesos --log_dir=/var/log/mesos"
There are 2 Mesos slaves with identical configurations: 2 CPUs and 996 MB of memory each. The slave's command options are set as follows:
MESOS_SLAVE_OPTS="--master=$MESOS_MASTER_IP:5050 --log_dir=/var/log/mesos"
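1. Registering a framework is the first step in the communication between a framework and the Mesos master. Only after the framework registers successfully is a connection with the Mesos master established. Run the following command to register a framework.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -d@register.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The -d option of curl specifies the data sent to the API, which here is a JSON string. Writing a long JSON string on the command line is painful, so it is best to put it in a file; curl reads the file when its name is prefixed with @. In this example the file is named register.json. For how to define the JSON, see mesos.proto and scheduler.proto.
The response looks like this: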
* upload completely sent off: 257 out of 257 bytes
< HTTP/1.1 200 OK
< Date: Fri, 15 Apr 2016 05:56:03 GMT
< Mesos-Stream-Id: b3a3e239-f0ad-43d3-b635-1c1d9c66566b
< Content-Type: application/json
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Transfer-Encoding: chunked
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
<
70
{"subscribed":{"framework_id":{"value":"test13"}},"type":"SUBSCRIBED"}20
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O4"},"resources":[{"name":"cpus","role":"*","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":996.0},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":30509.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"\/slave(1)","scheme":"http"}},{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S0"},"framework_id":{"value":"test13"},"hostname":"slave0","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O5"},"resources":[{"name":"cpus","role":"*","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":996.0},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":30509.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"}],"url":{"address":{"hostname":"slave0","ip":"9.111.254.62","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}
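The Mesos-Stream-Id header serves as the unique identifier of this framework. Later actions such as launching tasks and declining offers all need it.
After the framework registers successfully, it receives offers from the master if the master has resources available.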
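The response stream above frames every event as a length in bytes, a newline, and then exactly that many bytes of JSON. This is why the next length appears glued to the end of the previous record in the curl output. The following is a minimal sketch of a decoder for that framing; the function name read_records and the simulated stream are assumptions made for illustration, not part of Mesos.

```python
import io
import json


def read_records(stream):
    """Yield decoded JSON events from a length-prefixed byte stream."""
    while True:
        # Each record starts with its length in bytes, written as ASCII digits
        # and terminated by a newline.
        header = stream.readline()
        if not header:
            return  # the stream was closed
        length = int(header.strip())
        body = stream.read(length)
        if len(body) < length:
            raise ValueError("stream ended in the middle of a record")
        yield json.loads(body.decode("utf-8"))


# Simulated stream: a SUBSCRIBED event followed by a HEARTBEAT event.
subscribed = b'{"subscribed":{"framework_id":{"value":"test13"}},"type":"SUBSCRIBED"}'
heartbeat = b'{"type":"HEARTBEAT"}'
raw = b"%d\n%s%d\n%s" % (len(subscribed), subscribed, len(heartbeat), heartbeat)

events = list(read_records(io.BytesIO(raw)))
print([event["type"] for event in events])
```

With a live connection, the same function could be handed any file-like object whose read(n) blocks until n bytes are available; that behaviour is an assumption about the HTTP client in use.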
Example contents of register.json:
{
  "framework_id": {"value": "test13"},
  "type": "SUBSCRIBE",
  "subscribe": {
    "framework_info": {
      "user": "root",
      "name": "test13",
      "failover_timeout": 60,
      "role": "aa",
      "id": {"value": "test13"},
      "principal": "test",
      "capabilities": {"type": "REVOCABLE_RESOURCES"}
    },
    "force": true
  }
}
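Instead of editing the file by hand, the same payload could be generated by a small script. The sketch below mirrors the fields of the register.json example; the helper name build_subscribe and its default arguments are assumptions chosen to match that example.

```python
import json


def build_subscribe(framework_name, user="root", role="aa", principal="test",
                    failover_timeout=60):
    """Return a SUBSCRIBE call shaped like the register.json example."""
    framework_id = {"value": framework_name}
    return {
        "framework_id": framework_id,
        "type": "SUBSCRIBE",
        "subscribe": {
            "framework_info": {
                "user": user,
                "name": framework_name,
                "failover_timeout": failover_timeout,
                "role": role,
                "id": framework_id,
                "principal": principal,
                "capabilities": {"type": "REVOCABLE_RESOURCES"},
            },
            "force": True,
        },
    }


call = build_subscribe("test13")
payload = json.dumps(call)
print(payload)  # save this output as register.json and pass it to curl with -d@
```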
2. Use the registered framework to launch a task.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@launch_task.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* Done waiting for 100-continue
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Apr 2016 05:37:44 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
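Every call after registration is a POST to the same endpoint with the Mesos-Stream-Id header attached. As a rough Python equivalent of the curl invocation, the sketch below only builds the request object and does not send it; build_call_request is a hypothetical helper and 127.0.0.1:5050 is a placeholder for $MESOS_MASTER_IP:5050.

```python
import json
import urllib.request


def build_call_request(master, stream_id, call):
    """Build (but do not send) a POST request for the scheduler endpoint."""
    url = "http://%s/master/api/v1/scheduler" % master
    return urllib.request.Request(
        url,
        data=json.dumps(call).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Mesos-Stream-Id": stream_id,
        },
        method="POST",
    )


# Abbreviated ACCEPT call; a real one would also carry the LAUNCH operation.
call = {
    "framework_id": {"value": "test13"},
    "type": "ACCEPT",
    "accept": {
        "offer_ids": [{"value": "81c111bd-f27d-41a4-b184-64090dec3048-O428"}],
        "operations": [],
    },
}
request = build_call_request(
    "127.0.0.1:5050",  # placeholder master address
    "b3a3e239-f0ad-43d3-b635-1c1d9c66566b",
    call,
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

A 202 Accepted reply, as in the curl output above, only means the master queued the call; the outcome arrives later as events on the subscription stream.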
Example contents of launch_task.json:
{
  "framework_id": {"value": "test13"},
  "type": "ACCEPT",
  "accept": {
    "offer_ids": [
      {"value": "81c111bd-f27d-41a4-b184-64090dec3048-O428"}
    ],
    "filters": {},
    "operations": [
      {
        "type": "LAUNCH",
        "launch": {
          "task_infos": [
            {
              "name": "task 00012",
              "task_id": {"value": "00012"},
              "agent_id": {"value": "5bb4aba0-628b-4309-b36f-767db1ebb7f4-S0"},
              "resources": [
                {"name": "mem", "type": "SCALAR", "scalar": {"value": 996.0}},
                {"name": "cpus", "type": "SCALAR", "scalar": {"value": 1.0}}
              ],
              "command": {"value": "/bin/sleep 3000s"}
            }
          ]
        }
      }
    ]
  }
}
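The offer ID and agent ID in this file have to be copied from an offer the framework actually received. One way to avoid copy errors is to derive the ACCEPT call from the offer itself. The following is a minimal sketch under that assumption; build_launch is a hypothetical helper, and the trimmed offer reuses IDs from the OFFERS event shown earlier.

```python
import json


def build_launch(framework_id, offer, task_id, command, cpus, mem):
    """Build an ACCEPT call that launches one command task on the offer's agent."""
    def scalar(name, value):
        return {"name": name, "type": "SCALAR", "scalar": {"value": value}}

    task = {
        "name": "task " + task_id,
        "task_id": {"value": task_id},
        "agent_id": offer["agent_id"],  # the task must run on the offering agent
        "resources": [scalar("mem", mem), scalar("cpus", cpus)],
        "command": {"value": command},
    }
    return {
        "framework_id": {"value": framework_id},
        "type": "ACCEPT",
        "accept": {
            "offer_ids": [offer["id"]],
            "filters": {},
            "operations": [{"type": "LAUNCH", "launch": {"task_infos": [task]}}],
        },
    }


# Trimmed offer: only the two fields this sketch reads.
offer = {
    "id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O4"},
    "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},
}
call = build_launch("test13", offer, "00012", "/bin/sleep 3000s", cpus=1.0, mem=996.0)
print(json.dumps(call, indent=2))
```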
Result
Check the framework's output: the framework has received an UPDATE event, which the Mesos master sends to the framework to report a change in task status.
{"type":"UPDATE","update":{"status":{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"container_status":{"network_infos":[{"ip_address":"9.111.254.36","ip_addresses":[{"ip_address":"9.111.254.36"}]}]},"executor_id":{"value":"00016"},"source":"SOURCE_EXECUTOR","state":"TASK_RUNNING","task_id":{"value":"00016"},"timestamp":1460526113.05411,"uuid":"H9KoCldtRTu0T7halLxs9w=="}}}20
{"type":"HEARTBEAT"}391
Note: the master takes the resources left over from the current offer and offers them again as a new offer, which may go to other frameworks.
3. After the framework receives a task status update, it must send an ACKNOWLEDGE call to the master to confirm receipt. Otherwise, the same event keeps being resent to the framework until its acknowledgement is received.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@acknowledge.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* upload completely sent off: 264 out of 264 bytes
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Apr 2016 06:01:31 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
Example contents of acknowledge.json:
{
  "framework_id": {"value": "test13"},
  "type": "ACKNOWLEDGE",
  "acknowledge": {
    "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},
    "task_id": {"value": "00016"},
    "uuid": "H9KoCldtRTu0T7halLxs9w=="
  }
}
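The agent_id, task_id and uuid in this file are copied verbatim from the UPDATE event. As a sketch of how that copying could be automated, the hypothetical helper build_acknowledge below derives the call from a parsed event. Returning None for an update without a uuid reflects my understanding that only updates carrying a uuid need acknowledging; treat that as an assumption.

```python
import json


def build_acknowledge(framework_id, update_event):
    """Build an ACKNOWLEDGE call for an UPDATE event, or None if no uuid is set."""
    status = update_event["update"]["status"]
    if "uuid" not in status:
        return None  # assumed: updates without a uuid need no acknowledgement
    return {
        "framework_id": {"value": framework_id},
        "type": "ACKNOWLEDGE",
        "acknowledge": {
            "agent_id": status["agent_id"],
            "task_id": status["task_id"],
            "uuid": status["uuid"],  # base64 string, passed back unchanged
        },
    }


# Trimmed copy of the UPDATE event from the framework output above.
event = json.loads(
    '{"type":"UPDATE","update":{"status":{'
    '"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},'
    '"state":"TASK_RUNNING","task_id":{"value":"00016"},'
    '"uuid":"H9KoCldtRTu0T7halLxs9w=="}}}'
)
ack = build_acknowledge("test13", event)
print(json.dumps(ack))
```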
Result
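In the framework's output, events with the same UUID no longer appear.
4. When a task finishes, the resources it occupied are released, which makes a new block of resources available. The master then offers these resources to the appropriate framework according to the DRF (Dominant Resource Fairness) algorithm.
5. Decline an offer. If the offer currently allocated to the framework is not suitable, the framework can decline it.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@decline.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this: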
* upload completely sent off: 214 out of 214 bytes
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Apr 2016 06:17:56 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
<
Example contents of decline.json:
{
  "framework_id": {"value": "test13"},
  "type": "DECLINE",
  "decline": {
    "offer_ids": [
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O15"}
    ]
  }
}
Result
The declined offer is no longer available.
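What counts as an unsuitable offer is up to the framework. As an illustration only, the sketch below declines every offer with fewer CPUs than a task needs. The helpers scalar_value and build_decline are hypothetical, and the resource values in the sample offers are invented; only the ID pattern follows the examples in this article.

```python
import json


def scalar_value(offer, name):
    """Return the value of a SCALAR resource in an offer, or 0.0 if it is absent."""
    for resource in offer.get("resources", []):
        if resource["name"] == name and resource["type"] == "SCALAR":
            return resource["scalar"]["value"]
    return 0.0


def build_decline(framework_id, offers):
    """Build a DECLINE call covering every offer in the list."""
    return {
        "framework_id": {"value": framework_id},
        "type": "DECLINE",
        "decline": {"offer_ids": [offer["id"] for offer in offers]},
    }


# Hypothetical offers with invented resource values.
offers = [
    {"id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O15"},
     "resources": [{"name": "cpus", "type": "SCALAR", "scalar": {"value": 0.5}}]},
    {"id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O16"},
     "resources": [{"name": "cpus", "type": "SCALAR", "scalar": {"value": 2.0}}]},
]
needed_cpus = 1.0
too_small = [offer for offer in offers if scalar_value(offer, "cpus") < needed_cpus]
call = build_decline("test13", too_small)
print(json.dumps(call))
```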
6. When many tasks run in a Mesos cluster, the cluster keeps producing resource fragments. What can be done when none of these fragments is large enough to launch a task on its own? There are two ways to merge the resource fragments on the same slave.
6.1 ACCEPT event
When launching a task, list the offers for the fragmented resources on the slave together in offer_ids; the fragments are then merged automatically. Note that these fragments must all belong to the same slave.
Example contents of launch_task.json:
{
  "framework_id": {"value": "test13"},
  "type": "ACCEPT",
  "accept": {
    "offer_ids": [
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O14"},
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O16"}
    ],
    "filters": {},
    "operations": [
      {
        "type": "LAUNCH",
        "launch": {
          "task_infos": [
            {
              "name": "task 00017",
              "task_id": {"value": "00017"},
              "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},
              "resources": [
                {"name": "mem", "type": "SCALAR", "scalar": {"value": 996.0}},
                {"name": "cpus", "type": "SCALAR", "scalar": {"value": 2.0}}
              ],
              "command": {"value": "/bin/sleep 180s"}
            }
          ]
        }
      }
    ]
  }
}
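When the task finishes, the Mesos master offers these resources as a single block to the appropriate framework according to the DRF algorithm.
6.2 Merging fragmented resources with a DECLINE event
Declining each of the fragments also consolidates them. For example, suppose there are three offers, all belonging to the same slave. After these offers are declined, the Mesos master automatically merges the fragments and then offers the merged resources to the appropriate framework according to the DRF algorithm.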
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O24"},"resources":[{"name":"cpus","role":"*","scalar":{"value":1.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":512.0},"type":"SCALAR"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O26"},"resources":[{"name":"cpus","role":"*","scalar":{"value":1.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":256.0},"type":"SCALAR"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O27"},"resources":[{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":228.0},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":30509.0},"type":"SCALAR"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}
The DECLINE call looks like this:
{
  "framework_id": {"value": "test13"},
  "type": "DECLINE",
  "decline": {
    "offer_ids": [
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O24"},
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O26"},
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O27"}
    ]
  }
}
After the decline succeeds, a single consolidated offer is sent to the appropriate framework.
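To see what the consolidated offer should contain, the scalar resources of the three fragments (O24, O26 and O27) can be added up. The sketch below does this and also rejects offers from different slaves, since only fragments on the same slave can be merged. The helpers combined_scalars and trimmed_offer are hypothetical; the resource values are taken from the three OFFERS events above.

```python
from collections import defaultdict

PREFIX = "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-"


def combined_scalars(offers):
    """Sum the SCALAR resources of offers that all come from one agent."""
    agents = {offer["agent_id"]["value"] for offer in offers}
    if len(agents) != 1:
        raise ValueError("offers span several agents: %s" % sorted(agents))
    totals = defaultdict(float)
    for offer in offers:
        for resource in offer["resources"]:
            if resource["type"] == "SCALAR":
                totals[resource["name"]] += resource["scalar"]["value"]
    return dict(totals)


def trimmed_offer(suffix, **scalars):
    """Build a trimmed offer on slave S1 holding only the fields read above."""
    return {
        "id": {"value": PREFIX + suffix},
        "agent_id": {"value": PREFIX + "S1"},
        "resources": [
            {"name": name, "type": "SCALAR", "scalar": {"value": value}}
            for name, value in scalars.items()
        ],
    }


offers = [
    trimmed_offer("O24", cpus=1.0, mem=512.0),
    trimmed_offer("O26", cpus=1.0, mem=256.0),
    trimmed_offer("O27", mem=228.0, disk=30509.0),
]
totals = combined_scalars(offers)
print(totals)
```

The totals come to 2 CPUs and 996 MB of memory, which matches the full capacity of one slave in this test environment.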
7. Kill a task.
The KILL event is used to kill a specific task; each call kills one task.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@kill_task.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* upload completely sent off: 211 out of 211 bytes
< HTTP/1.1 202 Accepted
< Date: Fri, 15 Apr 2016 13:08:25 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
Example contents of kill_task.json:
{
  "framework_id": {"value": "test13"},
  "type": "KILL",
  "kill": {
    "task_id": {"value": "00020"},
    "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"}
  }
}
Result
The framework's output looks like this:
{"type":"UPDATE","update":{"status":{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S0"},"container_status":{"network_infos":[{"ip_address":"9.111.254.62","ip_addresses":[{"ip_address":"9.111.254.62"}]}]},"executor_id":{"value":"00012"},"message":"Command terminated with signal Terminated","source":"SOURCE_EXECUTOR","state":"TASK_KILLED","task_id":{"value":"00012"},"timestamp":1460725705.73596,"uuid":"mKSkxCycRLKS8y4YA2HM4A=="}}}
The task's resources are released and become available again. If there is only one framework, a new offer appears in the output of the current framework.
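A framework that kills tasks needs to build the KILL call and then recognise the resulting status update as final. The following is a small sketch of both steps; build_kill and is_terminal are hypothetical helpers, and the set of terminal states lists the TaskState values I am confident about, so it may be incomplete for newer Mesos versions.

```python
import json

# Task states after which the task will not run again (possibly incomplete).
TERMINAL_STATES = {
    "TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST", "TASK_ERROR",
}


def build_kill(framework_id, task_id, agent_id):
    """Build a KILL call for one task on one agent."""
    return {
        "framework_id": {"value": framework_id},
        "type": "KILL",
        "kill": {
            "task_id": {"value": task_id},
            "agent_id": {"value": agent_id},
        },
    }


def is_terminal(update_event):
    """Return True if the UPDATE event reports a final task state."""
    return update_event["update"]["status"]["state"] in TERMINAL_STATES


call = build_kill("test13", "00020", "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1")
print(json.dumps(call))

# Trimmed copy of the TASK_KILLED update shown above.
killed = {"type": "UPDATE", "update": {"status": {
    "state": "TASK_KILLED", "task_id": {"value": "00012"}}}}
running = {"type": "UPDATE", "update": {"status": {
    "state": "TASK_RUNNING", "task_id": {"value": "00016"}}}}
print(is_terminal(killed), is_terminal(running))
```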
8. Tear down the framework.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id: b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@teardown.json http://9.111.254.199:5050/master/api/v1/scheduler
The response looks like this:
* upload completely sent off: 77 out of 77 bytes
< HTTP/1.1 202 Accepted
< Date: Fri, 15 Apr 2016 13:25:49 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
The contents of teardown.json are as follows:
{
  "framework_id": {"value": "test13"},
  "type": "TEARDOWN"
}
Result
The process that registered the framework exits automatically, and the framework becomes inactive.
For more information about the Mesos RESTful API, please refer to:
http://mesos.apache.org/documentation/latest/scheduler-http-api/
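About the author: Gao Zhifang (高智芳) is a software test engineer at IBM who works mainly in the cloud computing field. She likes to try new testing techniques and methods and enjoys finding bugs through testing.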