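“Mesos is an open-source distributed resource management framework under Apache, often described as the kernel of a distributed system. Mesos was originally developed by the AMPLab at the University of California, Berkeley, and later came into wide use at Twitter.” (Source: Baidu Baike)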
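I. Background
As the internet has grown, big-data computing frameworks have kept appearing: MapReduce for offline processing, Storm for online processing, Spark for iterative computation, S4 for stream processing, and so on. Each one solves a particular class of application problem, and an internet company may well use several of them at once. For reasons of resource utilization, operations cost, and data sharing, we often deploy the different frameworks onto one common cluster so that they share its resources. However, different jobs need different resources (CPU, memory, network I/O, etc.), and when they run in the same cluster they inevitably interfere with one another and compete for resources, which lowers efficiency. This led to platforms for unified resource management and scheduling, with two typical representatives: Mesos and YARN.
II. The Mesos Framework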
Mesos Architecture
The above figure shows the main components of Mesos. Mesos consists of a master daemon that manages agent daemons running on each cluster node, and Mesos frameworks that run tasks on these agents.
The master enables fine-grained sharing of resources (CPU, RAM, …) across frameworks by making them resource offers. Each resource offer contains a list of <agent ID, resource1: amount1, resource2: amount2, ...> (NOTE: as keyword ‘slave’ is deprecated in favor of ‘agent’, driver-based frameworks will still receive offers with slave ID, whereas frameworks using the v1 HTTP API receive offers with agent ID). The master decides how many resources to offer to each framework according to a given organizational policy, such as fair sharing or strict priority. To support a diverse set of policies, the master employs a modular architecture that makes it easy to add new allocation modules via a plugin mechanism.
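As a rough illustration of what a fair-sharing style policy decides (this is not Mesos's allocator, which weighs several resource types and is pluggable), the sketch below offers next to whichever framework currently holds the fewest CPUs. The function name pick_next_framework and the name:cpus input format are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a fair-sharing style choice; not Mesos code.
# Each argument is "framework_name:currently_allocated_cpus".
pick_next_framework() {
  local best_name="" best_cpus="" entry name cpus
  for entry in "$@"; do
    name=${entry%%:*}   # text before the first colon
    cpus=${entry##*:}   # text after the last colon
    if [ -z "$best_name" ] || [ "$cpus" -lt "$best_cpus" ]; then
      best_name=$name
      best_cpus=$cpus
    fi
  done
  echo "$best_name"
}

# framework1 already holds 3 CPUs and framework2 holds none,
# so the next offer would go to framework2.
pick_next_framework framework1:3 framework2:0
```

A strict-priority policy would instead always pick the highest-priority framework that still wants resources, regardless of what it already holds.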
A framework running on top of Mesos consists of two components: a scheduler that registers with the master to be offered resources, and an executor process that is launched on agent nodes to run the framework’s tasks (see the App/Framework development guide for more details about framework schedulers and executors). While the master determines how many resources are offered to each framework, the frameworks' schedulers select which of the offered resources to use. When a framework accepts offered resources, it passes to Mesos a description of the tasks it wants to run on them. In turn, Mesos launches the tasks on the corresponding agents.
Example of resource offer
The figure below shows an example of how a framework gets scheduled to run a task.
Let’s walk through the events in the figure.
1. Agent 1 reports to the master that it has 4 CPUs and 4 GB of memory free.
2. The master then invokes the allocation policy module, which tells it that framework 1 should be offered all available resources. The master sends a resource offer describing what is available on agent 1 to framework 1.
3. The framework’s scheduler replies to the master with information about two tasks to run on the agent, using <2 CPUs, 1 GB RAM> for the first task, and <1 CPU, 2 GB RAM> for the second task.
4. Finally, the master sends the tasks to the agent, which allocates appropriate resources to the framework’s executor, which in turn launches the two tasks (depicted with dotted-line borders in the figure). Because 1 CPU and 1 GB of RAM are still unallocated, the allocation module may now offer them to framework 2.
In addition, this resource offer process repeats when tasks finish and new resources become free.
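The resource accounting in the walkthrough can be checked with a few lines of shell arithmetic. This is only bookkeeping for the example numbers above, with hypothetical variable names; it does not talk to Mesos.

```shell
#!/usr/bin/env bash
# Bookkeeping for the offer example; variable names are illustrative only.
offer_cpus=4        # agent 1 reports 4 CPUs free
offer_mem_gb=4      # and 4 GB of memory free

task1_cpus=2; task1_mem_gb=1   # first task:  <2 CPUs, 1 GB RAM>
task2_cpus=1; task2_mem_gb=2   # second task: <1 CPU, 2 GB RAM>

used_cpus=$((task1_cpus + task2_cpus))
used_mem_gb=$((task1_mem_gb + task2_mem_gb))
remaining_cpus=$((offer_cpus - used_cpus))
remaining_mem_gb=$((offer_mem_gb - used_mem_gb))

echo "framework 1 uses: ${used_cpus} CPUs, ${used_mem_gb} GB RAM"
echo "left to offer to framework 2: ${remaining_cpus} CPU, ${remaining_mem_gb} GB RAM"
```

The remainder of 1 CPU and 1 GB matches what the walkthrough says the allocation module may offer to framework 2.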
While the thin interface provided by Mesos allows it to scale and allows the frameworks to evolve independently, one question remains: how can the constraints of a framework be satisfied without Mesos knowing about these constraints? For example, how can a framework achieve data locality without Mesos knowing which nodes store the data required by the framework? Mesos answers these questions by simply giving frameworks the ability to reject offers. A framework will reject the offers that do not satisfy its constraints and accept the ones that do. In particular, we have found that a simple policy called delay scheduling, in which frameworks wait for a limited time to acquire nodes storing the input data, yields nearly optimal data locality.
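The reject-or-accept decision behind delay scheduling can be sketched as follows. This is a minimal illustration under assumptions, not a real framework scheduler: the function name decide_offer, its argument order, and the agent names are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of delay scheduling; not part of any Mesos API.
# Usage: decide_offer OFFERED_NODE WAITED_SECS MAX_WAIT_SECS PREFERRED_NODE...
decide_offer() {
  local node=$1 waited=$2 max_wait=$3
  shift 3
  local preferred
  # Accept immediately if the offered node stores the input data.
  for preferred in "$@"; do
    if [ "$node" = "$preferred" ]; then
      echo accept
      return 0
    fi
  done
  # Otherwise keep declining until the wait budget is used up.
  if [ "$waited" -ge "$max_wait" ]; then
    echo accept
  else
    echo decline
  fi
}

decide_offer agent2 0 5 agent1 agent3   # non-local node, still waiting
decide_offer agent1 1 5 agent1 agent3   # node holds the data
decide_offer agent2 5 5 agent1 agent3   # wait budget exhausted
```

The three calls print decline, accept, and accept. The wait limit is what keeps a framework from starving while it holds out for a node with local data.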
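The material above is excerpted from the official Mesos website and is meant to help beginners understand Mesos.
Installation
I. Machines and System Preparation
Prepare three machines running Ubuntu 16.04.
Assigned IPs: 192.168.0.124, 192.168.0.125, 192.168.0.126
Follow the official Mesos setup instructions: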
Ubuntu 16.04
Following are the instructions for stock Ubuntu 16.04. If you are using a different OS, please install the packages accordingly.
# Update the packages.
$ sudo apt-get update
# Install a few utility tools.
$ sudo apt-get install -y tar wget git
# Install the latest OpenJDK.
$ sudo apt-get install -y openjdk-8-jdk
# Install autotools (Only necessary if building from git repository).
$ sudo apt-get install -y autoconf libtool
# Install other Mesos dependencies.
$ sudo apt-get -y install build-essential python-dev libcurl4-nss-dev libsasl2-dev libsasl2-modules maven libapr1-dev libsvn-dev zlib1g-dev
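Before starting the build, it can help to confirm that the main tools from the packages above are on the PATH. The helper name missing_tools is hypothetical, and the tool list is an assumed subset of what those packages provide (for example javac from the JDK and mvn from maven); it does not check the development libraries.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every named tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed subset of the tools installed by the apt-get commands above.
missing=$(missing_tools tar wget git javac autoconf mvn g++ make)
if [ -n "$missing" ]; then
  echo "missing tools:" $missing
else
  echo "all listed tools found"
fi
```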
II. Environment and Installation
<1> Mesos source package
<2> Oracle JDK 1.8.0_171 (installation omitted; see the JDK section of the basic Ubuntu configuration notes)
- Download the latest stable release from Apache (Recommended)
$ wget http://www.apache.org/dist/mesos/1.2.0/mesos-1.2.0.tar.gz
$ tar -zxf mesos-1.2.0.tar.gz
$ cd mesos-1.2.0
$ mkdir -p /usr/local/mesos/mesos
$ ./configure --prefix=/usr/local/mesos/mesos
$ make -j4 && make -j4 install
- Clone the Mesos git repository (Advanced Users Only)
$ git clone https://git-wip-us.apache.org/repos/asf/mesos.git
Building Mesos (Posix)
# Change working directory.
$ cd mesos
# Bootstrap (Only required if building from git repository).
$ ./bootstrap
# Configure and build.
$ mkdir build
$ cd build
$ ../configure
$ make
In order to speed up the build and reduce verbosity of the logs, you can append -j <number of cores> V=0 to make.
# Run test suite.
$ make check
# Install (Optional).
$ make install
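The parallel-build suggestion in the steps above can be sketched by deriving the job count from the machine's core count. The variable name cores is illustrative, and the snippet only prints the resulting command instead of running it. More parallel jobs also means more memory in use during compilation.

```shell
#!/usr/bin/env bash
# Detect the number of online cores; fall back to getconf where nproc is absent.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Print the make invocation rather than running it.
echo "make -j${cores} V=0"
```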
The following error appeared during make:
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
Solution:
The cause is most likely insufficient memory, which is an annoying pitfall. A temporary swap file works around it:
sudo dd if=/dev/zero of=/swapfile bs=64M count=16
sudo mkswap /swapfile
sudo swapon /swapfile
After compiling, you may wish to turn the swap file off and remove it:
sudo swapoff /swapfile
sudo rm /swapfile
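For reference, the dd invocation above writes 16 blocks of 64 MiB each, so the swap file is 1 GiB. The sketch below redoes that arithmetic and, on Linux, prints the memory and swap totals so you can see whether the swap file is currently counted. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Size of the swap file created by: dd bs=64M count=16
block_mib=64
block_count=16
swap_mib=$((block_mib * block_count))
echo "swap file size: ${swap_mib} MiB"   # prints: swap file size: 1024 MiB

# On Linux, show total RAM and total swap as the kernel sees them.
if [ -r /proc/meminfo ]; then
  grep -E '^(MemTotal|SwapTotal):' /proc/meminfo
fi
```

If the compiler is still killed with 1 GiB of swap, a larger count or a smaller make -j value are the two obvious knobs to try.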
Repeat the steps above until all three machines are installed.
Install ZooKeeper and start it.
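The original does not show the ZooKeeper configuration, so the following is only a sketch of what a three-node ensemble on these machines might look like. The dataDir path, the server ids, and the 2888:3888 peer ports are assumptions (conventional example values); only the three IPs and client port 2181 come from this article. The script writes into a temporary directory so it is safe to run as-is.

```shell
#!/usr/bin/env bash
# Sketch only: generate an example zoo.cfg in a scratch directory.
conf_dir=$(mktemp -d)

cat > "$conf_dir/zoo.cfg" <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
# Assumed data directory; choose your own.
dataDir=/var/lib/zookeeper
clientPort=2181
# Assumed server ids and peer/election ports.
server.1=192.168.0.124:2888:3888
server.2=192.168.0.125:2888:3888
server.3=192.168.0.126:2888:3888
EOF

# Each server also needs a myid file inside its dataDir holding its own id
# (1 on 192.168.0.124 in this sketch). Written to the scratch directory here.
echo 1 > "$conf_dir/myid"

echo "example config written to $conf_dir"
```

With the real files in place, each node is typically started with bin/zkServer.sh start from the ZooKeeper installation directory.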
III. Configuration and Startup
./mesos-master.sh --ip=192.168.0.126 --work_dir=/tmp/mesos --zk=zk://192.168.0.126:2181,192.168.0.125:2181,192.168.0.124:2181/mesos --quorum=2
./mesos-master.sh --ip=192.168.0.125 --work_dir=/tmp/mesos --zk=zk://192.168.0.126:2181,192.168.0.125:2181,192.168.0.124:2181/mesos --quorum=2
./mesos-master.sh --ip=192.168.0.124 --work_dir=/tmp/mesos --zk=zk://192.168.0.126:2181,192.168.0.125:2181,192.168.0.124:2181/mesos --quorum=2
With three masters, --quorum is set to 2, a majority of the masters.
./mesos-agent.sh --master=192.168.0.126:5050 --work_dir=/tem_mesos
(Start the agent on the 125 and 124 machines, formerly called slaves; remember to add --port.)
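The --zk and --quorum values passed to the masters can be derived from the list of master IPs. The sketch below assumes the three addresses assigned earlier and ZooKeeper's client port 2181. The quorum is computed as a strict majority of the masters. Variable names are illustrative.

```shell
#!/usr/bin/env bash
# Master/ZooKeeper hosts, as assigned in the machine preparation step.
master_ips="192.168.0.126 192.168.0.125 192.168.0.124"

count=0
zk_hosts=""
for ip in $master_ips; do
  count=$((count + 1))
  # Append "ip:2181", comma-separated after the first entry.
  zk_hosts="${zk_hosts:+$zk_hosts,}${ip}:2181"
done

quorum=$((count / 2 + 1))          # strict majority: 3 masters -> 2
zk_url="zk://${zk_hosts}/mesos"

echo "--zk=${zk_url} --quorum=${quorum}"
```

For a single-master test setup the same formula gives a quorum of 1, and for five masters it gives 3.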