I. This problem had been bothering me for several days: pods on the same node could communicate without any problem, but communication between pods on different nodes did not work. The issue was finally pinned down to getting flannel to connect the networks of the different nodes.
II. Environment
1. Ubuntu 14.04, Kubernetes 1.4, flannel 0.5.5, etcd 2.3, Docker 1.12.3
2. The Kubernetes cluster has three nodes: a master (192.168.110.151, which also serves as the private registry), minion1 (192.168.110.152) and minion2 (192.168.110.154).
III. Procedure
1. Configure the cluster software
(1) etcd on the master (under /home/docker/xu/etcd)
root@master:/home/docker/xu/etcd# tree
.
├── etcd
├── etcd0.etcd
│ └── member
│ ├── snap
│ │ ├── 000000000000005c-0000000000013889.snap
│ │ ├── 0000000000000085-0000000000015f9a.snap
│ │ ├── 00000000000000bc-00000000000186ab.snap
│ │ ├── 00000000000000bf-000000000001adbc.snap
│ │ └── 00000000000000bf-000000000001d4cd.snap
│ └── wal
│ ├── 0000000000000000-0000000000000000.wal
│ └── 0000000000000001-0000000000012017.wal
├── etcdctl
└── run.sh
run.sh is the startup script; its contents are:
killall -9 etcd
./etcd \
-name etcd0 \
-data-dir etcd0.etcd \
-initial-advertise-peer-urls http://master:2380 \
-listen-peer-urls http://master:2380 \
-listen-client-urls http://master:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://master:2379 \
-initial-cluster-token etcd-cluster \
-initial-cluster etcd0=http://master:2380,etcd1=http://dockertest4:2380,etcd2=http://dockertest5:2380 \
-initial-cluster-state new
(2) etcd on minion1 (under /home/docker/xu/etcd)
run.sh is the startup script; its contents are:
killall -9 etcd
./etcd \
-name etcd1 \
-data-dir etcd1.etcd \
-initial-advertise-peer-urls http://dockertest4:2380 \
-listen-peer-urls http://dockertest4:2380 \
-listen-client-urls http://dockertest4:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://dockertest4:2379 \
-initial-cluster-token etcd-cluster \
-initial-cluster etcd0=http://master:2380,etcd1=http://dockertest4:2380,etcd2=http://dockertest5:2380 \
-initial-cluster-state new
(3) etcd on minion2 (under /home/docker/xu/etcd)
root@dockertest5:/home/docker/xu/etcd# tree
.
├── etcd
├── etcd2.etcd
│ └── member
│ ├── snap
│ │ ├── 000000000000005c-0000000000013889.snap
│ │ ├── 0000000000000085-0000000000015f9a.snap
│ │ ├── 00000000000000bc-00000000000186ab.snap
│ │ ├── 00000000000000bf-000000000001adbc.snap
│ │ └── 00000000000000bf-000000000001d4cd.snap
│ └── wal
│ ├── 0000000000000000-0000000000000000.wal
│ └── 0000000000000001-0000000000012006.wal
├── etcdctl
└── run.sh
run.sh is the startup script; its contents are:
killall -9 etcd
./etcd \
-name etcd2 \
-data-dir etcd2.etcd \
-initial-advertise-peer-urls http://dockertest5:2380 \
-listen-peer-urls http://dockertest5:2380 \
-listen-client-urls http://dockertest5:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://dockertest5:2379 \
-initial-cluster-token etcd-cluster \
-initial-cluster etcd0=http://master:2380,etcd1=http://dockertest4:2380,etcd2=http://dockertest5:2380 \
-initial-cluster-state new
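All three run.sh scripts refer to the etcd members by hostname (master, dockertest4 and dockertest5), so every node has to be able to resolve those names. A minimal /etc/hosts sketch that matches the shell prompts in this post (dockertest4 being minion1 and dockertest5 being minion2) would be:
192.168.110.151 master
192.168.110.152 dockertest4
192.168.110.154 dockertest5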
(4) Kubernetes on the master (kube-apiserver, kube-controller-manager and kube-scheduler live in /home/docker/xu/kubernetes/server/bin)
root@master:/home/docker/xu/kubernetes/server/bin# tree
.
├── hyperkube
├── kube-apiserver
├── kube-apiserver.docker_tag
├── kube-controller-manager
├── kube-controller-manager.docker_tag
├── kubectl
├── kube-dns
├── kubelet
├── kubemark
├── kube-proxy
├── kube-proxy.docker_tag
├── kube-scheduler
├── kube-scheduler.docker_tag
├── run-apiserver.sh
├── run-controller-manager.sh
└── run-scheduler.sh
run-apiserver.sh, run-controller-manager.sh and run-scheduler.sh are the startup scripts; their contents are, respectively:
./kube-apiserver --address=0.0.0.0 --insecure-port=8080 --service-cluster-ip-range='192.168.110.0/24' --kubelet_port=10250 --v=0 --logtostderr=true --etcd_servers=http://192.168.110.151:2379 --allow_privileged=false >> /opt/k8s/kube-apiserver.log 2>&1 &
./kube-controller-manager --v=0 --logtostderr=false --log_dir=/opt/k8s/kube --master=192.168.110.151:8080 >> /opt/k8s/kube-controller-manager.log 2>&1 &
./kube-scheduler --master='192.168.110.151:8080' --v=0 --log_dir=/opt/k8s/kube >> /opt/k8s/kube-scheduler.log 2>&1 &
(5) Kubernetes on minion1 (kube-proxy and kubelet live in /home/docker/xu/kubernetes/server/bin)
root@dockertest4:/home/docker/xu/kubernetes/server/bin# tree
.
├── hyperkube
├── kube-apiserver
├── kube-apiserver.docker_tag
├── kube-controller-manager
├── kube-controller-manager.docker_tag
├── kubectl
├── kube-dns
├── kubelet
├── kubemark
├── kube-proxy
├── kube-proxy.docker_tag
├── kube-scheduler
├── kube-scheduler.docker_tag
├── run-let.sh
└── run-proxy.sh
run-proxy.sh and run-let.sh are the startup scripts; their contents are, respectively:
./kube-proxy --logtostderr=true --v=0 --master=http://192.168.110.151:8080 --hostname_override=192.168.110.152 >> /opt/k8s/kube-proxy.log
./kubelet --logtostderr=true --v=0 --allow-privileged=false --log_dir=/opt/k8s/kube --address=0.0.0.0 --port=10250 --hostname_override=192.168.110.152 --api_servers=http://192.168.110.151:8080 >> /opt/k8s/kube-kubelet.log
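These two commands run in the foreground; if you prefer to background them the way the master scripts do, a variant of my own (not the original run-proxy.sh/run-let.sh) would be:
nohup ./kube-proxy --logtostderr=true --v=0 --master=http://192.168.110.151:8080 --hostname_override=192.168.110.152 >> /opt/k8s/kube-proxy.log 2>&1 &
nohup ./kubelet --logtostderr=true --v=0 --allow-privileged=false --log_dir=/opt/k8s/kube --address=0.0.0.0 --port=10250 --hostname_override=192.168.110.152 --api_servers=http://192.168.110.151:8080 >> /opt/k8s/kube-kubelet.log 2>&1 &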
(6) Kubernetes on minion2 (kube-proxy and kubelet live in /home/docker/xu/k8s/server/bin)
root@dockertest5:/home/docker/xu/k8s/server/bin# tree
.
├── hyperkube
├── kube-apiserver
├── kube-apiserver.docker_tag
├── kube-controller-manager
├── kube-controller-manager.docker_tag
├── kubectl
├── kube-dns
├── kubelet
├── kubemark
├── kube-proxy
├── kube-proxy.docker_tag
├── kube-scheduler
├── kube-scheduler.docker_tag
├── run-let.sh
└── run-proxy.sh
run-proxy.sh and run-let.sh are the startup scripts; their contents are, respectively:
./kube-proxy --logtostderr=true --v=0 --master=http://192.168.110.151:8080 --hostname_override=192.168.110.154 >> /opt/k8s/kube-proxy.log
./kubelet --logtostderr=true --v=0 --allow-privileged=false --log_dir=/opt/k8s/kube --address=0.0.0.0 --port=10250 --hostname_override=192.168.110.154 --api_servers=http://192.168.110.151:8080 >> /opt/k8s/kube-kubelet.log
2. Start the cluster
1. etcd on the master: in the etcd directory on the master, run ./run.sh
2. etcd on minion1: in the etcd directory on minion1, run ./run.sh
3. etcd on minion2: in the etcd directory on minion2, run ./run.sh
4. Verify that etcd started successfully
In the etcd directory on the master, run ./etcdctl member list; output like the following means the etcd cluster is up:
root@master:/home/docker/xu/etcd# ./etcdctl member list
35e013635b05ca4f: name=etcd1 peerURLs=http://dockertest4:2380 clientURLs=http://dockertest4:2379 isLeader=true
70192b54fb86c1a5: name=etcd0 peerURLs=http://master:2380 clientURLs=http://master:2379 isLeader=false
835aada27b736086: name=etcd2 peerURLs=http://dockertest5:2380 clientURLs=http://dockertest5:2379 isLeader=false
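etcdctl 2.x also offers a cluster-health subcommand, which makes a handy second check; it should report every member as healthy:
./etcdctl cluster-health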
5. flannel on the master: in the flannel directory, run the following command
./flanneld -etcd-endpoints http://192.168.110.151:2379
where http://192.168.110.151:2379 is the address of the etcd cluster
6. flannel on minion1: in the flannel directory, run the following command:
./flanneld -etcd-endpoints http://192.168.110.151:2379
7. flannel on minion2: in the flannel directory, run the following command:
./flanneld -etcd-endpoints http://192.168.110.151:2379
8. In the etcd directory on the master, write the flannel network configuration:
./etcdctl set /coreos.com/network/config '{ "Network": "10.1.0.0/16" }'
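To read the value back and confirm it was stored as intended (same etcdctl binary, same directory):
./etcdctl get /coreos.com/network/config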
9. In the etcd directory on the master, run the following command to confirm the previous step took effect:
root@master:/home/docker/xu/etcd# ./etcdctl ls /coreos.com/network/subnets
/coreos.com/network/subnets/10.1.44.0-24
/coreos.com/network/subnets/10.1.54.0-24
/coreos.com/network/subnets/10.1.60.0-24
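Each node should now have picked up one of these /24 subnets. A quick per-node check is to look at the flannel0 interface and at the subnet file flanneld writes (the same file sourced in step 10 below); it should define FLANNEL_SUBNET and FLANNEL_MTU:
ip addr show flannel0
cat /run/flannel/subnet.env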
10. Then run the following commands on each node (a consolidated per-node sketch follows this list):
(1) mk-docker-opts.sh -i
(2) source /run/flannel/subnet.env
(3) ifconfig docker0 ${FLANNEL_SUBNET}
(4) sudo service docker stop
(5) dockerd --dns 8.8.8.8 --dns 8.8.4.4 --insecure-registry 192.168.110.151:5000 -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}
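For convenience the five steps can be collected into a single per-node script. This is just a sketch of my own using the exact commands and flags above, to be run from the flannel directory:
#!/bin/bash
# regenerate the flannel-derived Docker options, point docker0 at the flannel subnet,
# then restart the Docker daemon with a matching --bip/--mtu
mk-docker-opts.sh -i
source /run/flannel/subnet.env
ifconfig docker0 ${FLANNEL_SUBNET}
sudo service docker stop
dockerd --dns 8.8.8.8 --dns 8.8.4.4 \
  --insecure-registry 192.168.110.151:5000 \
  -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock \
  --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU} &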
11. Check the routing table on each node to confirm that flannel has connected the networks
root@master:/home/docker/xu/flannel# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0 192.168.110.2 0.0.0.0 UG 0 0 0 eth0
10.1.0.0 0.0.0.0 255.255.0.0 U 0 0 0 flannel0
10.1.60.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
192.168.110.0 0.0.0.0 255.255.255.0 U 1 0 0 eth0
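A simple cross-node test at this point: from the master (whose docker0 sits on 10.1.60.0/24 above), ping the docker0 address of one of the minion subnets listed in step 9. Assuming 10.1.44.0/24 was assigned to one of the minions, its docker0 should answer at the .1 address:
ping -c 3 10.1.44.1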
12. Kubernetes on the master: in the directory containing the binaries, run the following commands
./run-apiserver.sh
./run-controller-manager.sh
./run-scheduler.sh
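A quick way to confirm the apiserver is answering on the insecure port 8080 before starting the node daemons; it should return ok:
curl http://192.168.110.151:8080/healthz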
13. On minion1 and minion2, run the following commands
./run-proxy.sh
./run-let.sh
14. Run the following commands; output like this means the Kubernetes cluster is up
root@master:/home/docker/xu/flannel# kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
etcd-0 Healthy {"health": "true"}
scheduler Healthy ok
root@master:/home/docker/xu/flannel# kubectl get nodes
NAME STATUS AGE
192.168.110.152 Ready 1d
192.168.110.154 Ready 1d
3. Verify the environment
(1) The test is a Java web application that connects to a MySQL database: one MySQL pod runs on minion2, and a Tomcat pod (the image contains the web test code) runs on each of minion1 and minion2.
(2) If the Tomcat on minion1 can also reach the MySQL database, the network between the two nodes has been connected.
(3) The contents of mysql.yaml and tomcat.yaml are as follows:
mysql.yaml:
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  ports:
  - port: 3306
  selector:
    app: mysql_pod
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: mysql-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: mysql_pod
    spec:
      containers:
      - name: mysql
        image: 192.168.110.151:5000/mysql
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"
tomcat.yaml:
apiVersion: v1
kind: Service
metadata:
  name: hpe-java-web
spec:
  type: NodePort
  ports:
  - port: 8080
    nodePort: 31002
  selector:
    app: hpe_java_web_pod
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: hpe-java-web-deployement
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: hpe_java_web_pod
    spec:
      containers:
      - name: myweb
        image: 192.168.110.151:5000/tomact8
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
(4) Create the MySQL and Tomcat resources:
kubectl create -f mysql.yaml
kubectl create -f tomcat.yaml
(5) Verify that everything is running:
root@master:/home/docker/xu/test# kubectl get pods
NAME READY STATUS RESTARTS AGE
hpe-java-web-deployement-4oeax 1/1 Running 0 53m
hpe-java-web-deployement-kqkv8 1/1 Running 0 53m
mysql-deployment-bk5v1 1/1 Running 0 55m
root@master:/home/docker/xu/test# kubectl get service
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hpe-java-web 192.168.110.63 8080/TCP 53m
kubernetes 192.168.110.1 443/TCP 1d
mysql 192.168.110.220 3306/TCP 55m
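To confirm that the two Tomcat pods really ended up on different nodes (which is the whole point of this cross-node test), -o wide adds the node each pod is running on:
kubectl get pods -o wide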
The test page index.jsp:
<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
<%@ page import="java.sql.*" %>
<html>
<head><title>Insert title here</title></head>
<body>
<table>
<tr><th>Card number</th></tr>
<%
    String driverClass = "com.mysql.jdbc.Driver";
    // The mysql Service address is injected by Kubernetes as environment variables
    String ip = System.getenv("MYSQL_SERVICE_HOST");
    String port = System.getenv("MYSQL_SERVICE_PORT");
    System.out.println("MYSQL_SERVICE_PORT=" + port); // debug: log the injected port
    //String ip = "localhost";
    //String port = "3306";
    Connection conn;
    try {
        Class.forName(driverClass);
        conn = DriverManager.getConnection("jdbc:mysql://" + ip + ":" + port + "/bms", "root", "123456");
        Statement stmt = conn.createStatement();
        String sql = "select * from bms_appuser";
        ResultSet rs = stmt.executeQuery(sql);
        while (rs.next()) {
%>
<tr><td><%= rs.getString(3) %></td></tr>
<%
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }
%>
</table>
</body>
</html>
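The page reads the database address from the MYSQL_SERVICE_HOST and MYSQL_SERVICE_PORT environment variables that Kubernetes injects for the mysql Service. You can confirm they are set inside one of the Tomcat pods (pod name taken from the kubectl get pods output above):
kubectl exec hpe-java-web-deployement-4oeax -- env | grep MYSQL_SERVICE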
(6) The web project is named K8S; to keep it simple it contains only the single JSP page above, index.jsp. The database and table can be whatever is convenient.
(7) Access http://192.168.110.151:31002/K8S/index.jsp from minion1, from minion2, and from a machine outside the cluster; if all of them can see the data, the experiment succeeded.
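The same check can be run from the command line. Because the Service is of type NodePort, port 31002 is exposed on every node that runs kube-proxy, so the minion addresses can be used as well; for example:
curl http://192.168.110.152:31002/K8S/index.jsp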