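Preface
How do you find out which TCP connections are established between Pods in a K8S cluster, and which call relationships exist inside it? With k8spacket and Grafana you can visualize the TCP traffic in the cluster: how workloads communicate with each other, how many connections are established, how many bytes are exchanged, and how long those connections stay active.
Introduction
k8spacket is a tool written in Golang. It uses the third-party gopacket library to sniff TCP packets (incoming and outgoing) on workloads, and it creates TCP listeners on the network interfaces of running containers. When Kubernetes creates a new container, the CNI plugin is responsible for making communication with other containers possible. The most common approach is to isolate the network with a linux namespace and connect the isolated namespace to a bridge with a veth pair. Besides the bridge type, CNI plugins can use other types (vlan, ipvlan, macvlan), but all of them create a network interface for the container, and that interface is the main handle for the k8spacket sniffer. k8spacket helps you understand TCP packet traffic in a Kubernetes cluster: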
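shows traffic between workloads in the cluster
informs where traffic is routed outside the cluster
displays information about how connections close their sockets
shows the number of bytes sent/received by workloads
calculates how long connections stay established
displays the network connection topology between workloads across the whole cluster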
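k8spacket is a Kubernetes API client, so it can resolve the sniffed workloads into the cluster resource names (Pods and Services) shown in the visualization. It starts as a DaemonSet Pod, uses hostNetwork, and listens on the network interfaces of the node. k8spacket collects TCP streams, processes the data, and serves it through an API to Grafana dashboards via the Node Graph API Grafana data source plugin (see the Node Graph API plugin for details).
To use k8spacket, Grafana must be installed as well. The demonstration below uses a k8s cluster created with Kind.
Install k8spacket
#Install with Helm: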
[root@CentOS8 ~]# helm repo add k8spacket https://k8spacket.github.io/k8spacket-helm-chart
"k8spacket" has been added to your repositories
[root@CentOS8 ~]# helm install k8spacket --namespace k8spacket k8spacket/k8spacket --create-namespace
NAME: k8spacket
LAST DEPLOYED: Thu Aug 18 08:37:10 2022
NAMESPACE: k8spacket
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the application URL by running these commands:
export NODE_PORT=$(kubectl get --namespace k8spacket -o jsonpath="{.spec.ports[0].nodePort}" services k8spacket)
export NODE_IP=$(kubectl get nodes --namespace k8spacket -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
#By default, the chart uses the following command to collect all network interfaces to listen on:
ip address | grep @ | sed -E 's/.* (\w+)@.*/\1/' | tr '\n' ',' | sed 's/.$//'
#The result may include interfaces whose state is DOWN, in which case k8spacket fails at startup with:
2022/08/15 00:17:34 error opening pcap handle: tunl0: That device is not up
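To see why an interface such as tunl0 ends up in the list, the default pipeline can be run against a fixed sample instead of the live host. The following is a minimal sketch: the `list_interfaces` function name and the sample `ip address` output are assumptions made for illustration (only tunl0 comes from the error above), and `\w` in `sed -E` relies on GNU sed, as the chart command does.

```shell
#!/bin/sh
# Same pipeline as the chart default, but reading `ip address` output from stdin
# so that it can be tried against a fixed sample.
list_interfaces() {
  grep @ | sed -E 's/.* (\w+)@.*/\1/' | tr '\n' ',' | sed 's/.$//'
}

# Hypothetical `ip address` output; interface names other than tunl0 are made up.
sample_output='1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
3: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
4: veth1a2b3c4d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 8a:1f:2e:3d:4c:5b brd ff:ff:ff:ff:ff:ff link-netnsid 1'

interfaces=$(printf '%s\n' "$sample_output" | list_interfaces)
echo "$interfaces"
```

The pipeline keeps every line containing `@` and does not look at the interface state, so tunl0 is listed next to the interfaces that are up. On a real node, `ip address | list_interfaces` would produce the same kind of comma-separated list.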
#Therefore the parameters in values.yaml need to be customized. Pull the chart locally, extract it, and then edit it:
[root@CentOS8 k8spacket]# helm pull k8spacket/k8spacket
[root@CentOS8 k8spacket]# ls
k8spacket-0.1.0.tgz
[root@CentOS8 k8spacket]# tar -zxf k8spacket-0.1.0.tgz
[root@CentOS8 k8spacket]# cd k8spacket
[root@CentOS8 k8spacket]# ls
Chart.yaml docs README.md templates values.yaml
#Edit values.yaml to filter out tunl0:
k8sPacket:
  metrics:
    ## Hide source port when 'true' (set to string value 'dynamic' instead of decimal real source port) for Prometheus metrics cardinality reasons
    hideSourcePort: true
  reverseLookup:
    ## Reverse lookup db file based on GeoLite2 Free Geolocation Data
    ## See: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data?lang=en
    geoipDBPath: "/home/k8spacket/GeoLite2-City.mmdb"
    ## Whois result match regexp
    whoisRegexp: "(?:OrgName:|org-name:)\\s*(.*)"
  tcp:
    listener:
      interfaces:
        ## Command to achieve containers network interfaces
        command: "ip address | grep @ | grep -v tunl0 | sed -E 's/.* (\\w+)@.*/\\1/' | tr '\\n' ',' | sed 's/.$//'"
        ## How often refresh the list of network interfaces to listen
        refreshPeriod: "10s"
    assembler:
      ## See: https://pkg.go.dev/github.com/google/gopacket/tcpassembly#AssemblerOptions
      maxPagesPerConnection: 50
      maxPagesTotal: 50
      ## Every (periodDuration) seconds, flush connections that haven't seen activity in the past (closeOlderThanDuration) seconds.
      flushing:
        periodDuration: "10s"
        closeOlderThanDuration: "20s"
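#refreshPeriod sets how often the list of network interfaces to listen on is refreshed: listeners are added for new interfaces and removed for interfaces that no longer exist.
#Every periodDuration, connections that have seen no activity within the past closeOlderThanDuration are flushed.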
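The effect of the interface filter can be checked outside the cluster before changing the chart. The sketch below runs the filtered pipeline against a hypothetical `ip address` sample, using the interface name tunl0 from the error message. The function names and the sample are assumptions for illustration, and the second function, which drops every interface reported as `state DOWN` instead of matching one name, is an alternative that does not appear in the chart or in this article. In values.yaml the backslashes are doubled because the command sits inside a double-quoted YAML string; in a shell they are single.

```shell
#!/bin/sh
# Filter by name: drop tunl0 before extracting the interface names.
list_interfaces_without_tunl0() {
  grep @ | grep -v tunl0 | sed -E 's/.* (\w+)@.*/\1/' | tr '\n' ',' | sed 's/.$//'
}

# Alternative (assumption, not from the article): drop every interface reported as DOWN.
list_up_interfaces() {
  grep @ | grep -v 'state DOWN' | sed -E 's/.* (\w+)@.*/\1/' | tr '\n' ',' | sed 's/.$//'
}

# Hypothetical `ip address` lines; interface names other than tunl0 are made up.
sample_output='2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
3: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
4: veth1a2b3c4d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default'

by_name=$(printf '%s\n' "$sample_output" | list_interfaces_without_tunl0)
by_state=$(printf '%s\n' "$sample_output" | list_up_interfaces)
echo "by name:  $by_name"
echo "by state: $by_state"
```

On this sample both variants yield the same list. Filtering by state may be more robust when several interfaces are down, but whether it fits a given CNI setup would need to be verified on the node itself.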
#After a successful installation, the following DaemonSet Pod and Service are present:
[root@CentOS8 k8spacket]# kubectl get pod -n k8spacket -o wide
NAME              READY   STATUS    RESTARTS   AGE    IP              NODE      NOMINATED NODE   READINESS GATES
k8spacket-428bk   1/1     Running   0          6m8s   10.110.82.178   centos8   <none>           <none>
[root@CentOS8 k8spacket]# kubectl get svc -n k8spacket -o wide
NAME        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE   SELECTOR
k8spacket   ClusterIP   10.106.125.144   <none>        8080/TCP   19m   app.kubernetes.io/instance=k8spacket,app.kubernetes.io/name=k8spacket
#k8spacket exposes its metrics on the /metrics endpoint:
[root@CentOS8 k8spacket]# curl 10.106.125.144:8080/metrics
# HELP go_build_info Build information about the main Go module.
# TYPE go_build_info gauge
go_build_info{checksum="",path="github.com/k8spacket",version="(devel)"} 1
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.6341e-05
go_gc_duration_seconds{quantile="0.25"} 0.000109225
go_gc_duration_seconds{quantile="0.5"} 0.000222992
go_gc_duration_seconds{quantile="0.75"} 0.000482412
go_gc_duration_seconds{quantile="1"} 0.00082685
go_gc_duration_seconds_sum 0.001784635
go_gc_duration_seconds_count 6
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 59
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.17.13"} 1
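When only one value from this output is of interest, the Prometheus text format is simple enough to filter with awk. This is a minimal sketch: the `metric_value` helper is a hypothetical name, it only handles metrics without labels, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# Print the value of an unlabeled metric from Prometheus text-format input on stdin.
# HELP/TYPE comment lines are skipped because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Sample lines taken from the /metrics output shown above.
sample_metrics='# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 59
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.17.13"} 1'

goroutines=$(printf '%s\n' "$sample_metrics" | metric_value go_goroutines)
echo "$goroutines"
```

Against the running Service the same helper could be fed from curl, for example `curl -s 10.106.125.144:8080/metrics | metric_value go_goroutines`.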
#Install dashboards
#Download the k8spacket project and create the dashboard configmaps from its dashboards directory in K8S:
wget https://github.com/k8spacket/k8spacket/archive/refs/heads/master.zip
unzip master.zip
cd k8spacket-master
kubectl apply --recursive -f ./dashboards
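#This creates three dashboards: k8spacket-logs-dashboard, k8spacket-metrics-dashboard, and k8spacket-node-graph-dashboard.
#The metrics dashboard displays the Prometheus metrics and is not demonstrated here. Only the node-graph dashboard is covered.
#Install grafana
#Install grafana with Helm. The helm-charts repository is at: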
https://github.com/grafana/helm-charts
#As before, pull the chart locally:
helm repo add grafana https://grafana.github.io/helm-charts
helm fetch grafana/grafana
tar -zxf grafana-6.32.13.tgz
cd grafana/
#Chart version: 6.32.13
#grafana version: 9.0.5
#Edit values.yaml to add the Node Graph API plugin and data source, as well as the node-graph dashboard configmap, to Grafana, and enable data persistence. For example:
persistence:
  type: pvc
  enabled: true
  storageClassName: openebs-hostpath # default storage class
env:
  GF_INSTALL_PLUGINS: hamedkarbasi93-nodegraphapi-datasource
dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
      - name: 'default'
        orgId: 1
        folder: ''
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/default
dashboardsConfigMaps:
  default: k8spacket-node-graph-dashboard
datasources:
  nodegraphapi-plugin-datasource.yaml:
    apiVersion: 1
    datasources:
      - name: "Node Graph API"
        jsonData:
          url: "http://k8spacket.k8spacket.svc.cluster.local:8080"
        access: "proxy"
        basicAuth: false
        isDefault: false
        readOnly: false
        type: "hamedkarbasi93-nodegraphapi-datasource"
        typeLogoUrl: "public/plugins/hamedkarbasi93-nodegraphapi-datasource/img/logo.svg"
        typeName: "node-graph-plugin"
        orgId: 1
        version: 1
#Run the install command in the directory that contains values.yaml:
helm install grafana -f values.yaml ./
[root@CentOS8 grafana]# helm install grafana -f values.yaml ./
NAME: grafana
LAST DEPLOYED: Thu Aug 18 09:51:15 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
1. Get your 'admin' user password by running:
kubectl get secret --namespace default grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:
grafana.default.svc.cluster.local
Get the Grafana URL to visit by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=grafana" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace default port-forward $POD_NAME 3000
3. Login with the password from step 1 and the username: admin
#Get the password of the admin account:
kubectl get secret --namespace default grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
#Start a temporary port forward so that the grafana instance can be reached from outside the cluster:
kubectl --namespace default port-forward service/grafana 3000:80 --address 0.0.0.0
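Open the grafana UI at http://{Kind host IP}:3000 and log in as admin with the password obtained above. The Node Graph API plugin is shown as installed.
The node graph dashboard shows the network connection topology of the cluster.
#Usage
#Statistic types
#connection: helps you understand how many connections are established between workloads and with external clients. It tells you which sockets stay open and may cause problems.
#bytes: shows the number of bytes sent or received by workloads.
#duration: calculates the lifetime of connections.
#Filters
#by namespace: select one or more k8s namespaces
#by names included: select the workload names to visualize
#by names excluded: exclude workload names from the visualization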
Excerpted from the WeChat official account [奇妙的Linux世界]