❖ Analyzing container system calls: Sysdig
❖ Monitoring the container runtime: Falco
❖ Kubernetes audit logs
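Sysdig is a powerful tool for system monitoring, analysis, and troubleshooting.
It combines the functionality of strace, tcpdump, htop, iftop, and lsof in a single tool.
Beyond collecting system resource utilization, processes, network connections, and system calls, sysdig also has strong analysis capabilities. For example, it can:
• Sort processes by CPU usage
• Sort processes by network packets
• Find the processes with the most open file descriptors
• Show which files a process has opened
• Show the HTTP requests made by a process
• List the containers on a machine and their resource usage
Project: https://github.com/draios/sysdig
Documentation: https://github.com/draios/sysdig/wiki
sysdig registers hooks on system calls through a kernel driver module. Whenever a system call is entered or completes, the driver copies the call's information into a dedicated buffer. User-space components then process that data (decompressing, parsing, filtering, and so on), and the sysdig command line presents the result to the user.
Import the sysdig repo and install the EPEL repository.
After installing sysdig, update the system packages (this also updates sysdig); in this setup the driver module only loaded after the update.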
[root@master01 ~]# rpm --import https://s3.amazonaws.com/download.draios.com/DRAIOS-GPG-KEY.public
[root@master01 ~]# curl -s -o /etc/yum.repos.d/draios.repo https://s3.amazonaws.com/download.draios.com/stable/rpm/draios.repo
[root@master01 ~]# yum install epel-release -y
[root@master01 ~]# yum install sysdig -y
[root@master01 ~]# yum update -y
[root@master01 ~]# /usr/bin/sysdig-probe-loader # load the driver module
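Common sysdig options:
-l, --list: list the fields available for filtering and output
-M: stop collecting after the given number of seconds
-p, --print: specify the format used when printing events
-c: run a built-in tool (chisel) that performs a specific aggregation or analysis task
-w: save the captured events to a file
-r: read events from a file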
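Running sysdig with no arguments prints a large stream of system calls in real time.
Example: 59509 23:59:19.023099531 0 kubelet (1738) < epoll_ctl
Format: %evt.num %evt.outputtime %evt.cpu %proc.name (%thread.tid) %evt.dir %evt.type %evt.info
evt.num: incrementing event number
evt.time: time at which the event occurred (the default format prints it via %evt.outputtime)
evt.cpu: CPU on which the event was captured, that is, the CPU the system call ran on
proc.name: name of the process that generated the event
thread.tid: thread ID; for a single-threaded program this is also the process PID
evt.dir: event direction; > marks an enter event and < marks an exit event
evt.type: event name, such as open or stat; usually a system call
evt.args: event arguments; for a system call these correspond to the call's arguments (the default format prints them via %evt.info)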
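To make the field layout above concrete, here is a minimal Python sketch that splits one line of default sysdig output into the fields just described. The regular expression, the `SysdigEvent` type, and `parse_line` are illustrative names that are not part of sysdig; the sketch assumes the default format shown above and has only been checked against the sample line.

```python
import re
from typing import NamedTuple, Optional

# Mirrors the default format:
# %evt.num %evt.outputtime %evt.cpu %proc.name (%thread.tid) %evt.dir %evt.type %evt.info
LINE_RE = re.compile(
    r"^(?P<num>\d+)\s+"                       # evt.num
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"    # evt.outputtime
    r"(?P<cpu>\d+)\s+"                        # evt.cpu
    r"(?P<proc>.+?)\s+"                       # proc.name
    r"\((?P<tid>-?\d+)\)\s+"                  # thread.tid
    r"(?P<dir>[<>])\s+"                       # evt.dir
    r"(?P<type>\S+)"                          # evt.type
    r"(?:\s+(?P<info>.*))?$"                  # evt.info (may be empty)
)


class SysdigEvent(NamedTuple):
    num: int
    time: str
    cpu: int
    proc: str
    tid: int
    direction: str
    evt_type: str
    info: str


def parse_line(line: str) -> Optional[SysdigEvent]:
    """Return the parsed event, or None if the line does not match the format."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return SysdigEvent(
        num=int(m["num"]),
        time=m["time"],
        cpu=int(m["cpu"]),
        proc=m["proc"],
        tid=int(m["tid"]),
        direction=m["dir"],
        evt_type=m["type"],
        info=m["info"] or "",
    )


event = parse_line("59509 23:59:19.023099531 0 kubelet (1738) < epoll_ctl")
print(event)
```

A custom `-p` format would need a different pattern, so this is only useful for lines captured with the default output.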
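Custom output format:
sysdig -p "user:%user.name time:%evt.time proc_name:%proc.name"
To see the full list of filter fields: sysdig -l
Examples:
1. View the system calls of one process
sysdig proc.name=kubelet
2. View events that accept TCP connections
sysdig evt.type=accept
3. View file descriptors opened under /etc
sysdig fd.name contains /etc
4. View the system calls of a container
sysdig -M 10 container.name=web
Note: filters also support the operators =, !=, >=, >, <, <=, contains, in, exists, and, or, not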
Chisels: a toolbox of predefined scripts for analyzing specific scenarios.
sysdig -cl lists all chisels. Some commonly used ones:
sysdig -c netstat
sysdig -c ps
sysdig -c lsof
Network:
# Top processes by network usage
sysdig -c topprocs_net
# Count accepted connections per server port
sysdig -c fdcount_by fd.sport "evt.type=accept" -M 10
# Bytes transferred per server port
sysdig -c fdbytes_by fd.sport
# Count accepted connections per client IP
sysdig -c fdcount_by fd.cip "evt.type=accept" -M 10
# Bytes transferred per client IP
sysdig -c fdbytes_by fd.cip

Disk:
# Top processes by disk I/O
sysdig -c topprocs_file
# Number of file descriptors opened per process
sysdig -c fdcount_by proc.name "fd.type=file" -M 10
# Top files by bytes read and written
sysdig -c topfiles_bytes
sysdig -c topfiles_bytes proc.name=etcd
# Files with read/write activity under /tmp
sysdig -c fdbytes_by fd.filename "fd.directory=/tmp/"

CPU:
# Top processes by CPU usage
sysdig -c topprocs_cpu
# Top processes by CPU usage inside a container
sysdig -pc -c topprocs_cpu container.name=web
sysdig -pc -c topprocs_cpu container.id=web

Containers:
# List the containers on the machine and their resource usage
csysdig -vcontainers
# Top containers by resource usage
sysdig -c topcontainers_cpu
sysdig -c topcontainers_net
sysdig -c topcontainers_file
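Falco is a Linux security tool that uses system calls to protect and monitor a system. Falco was originally developed by Sysdig and later joined the CNCF incubator, becoming the first runtime security project to join the CNCF.
Falco ships with a set of default rules that monitor for abnormal behavior at the kernel level.
Project: https://github.com/falcosecurity/falco
Falco configuration directory: /etc/falco
• falco.yaml: Falco configuration, including the alert output channels
• falco_rules.yaml: the rules file; many threat scenarios are already defined by default
• falco_rules.local.yaml: custom rules that extend the defaults
• k8s_audit_rules.yaml: rules for Kubernetes audit logs
Installation docs: https://falco.org/zh/docs/installation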
[root@master01 ~]# rpm --import https://falco.org/repo/falcosecurity-3672BA8F.asc
[root@master01 ~]# curl -s -o /etc/yum.repos.d/falcosecurity.repo https://falco.org/repo/falcosecurity-rpm.repo
[root@master01 ~]# yum install epel-release -y
[root@master01 ~]# yum update
[root@master01 ~]# yum install falco -y
[root@master01 ~]# systemctl start falco
[root@master01 ~]# systemctl enable falco
[root@master01 ~]# touch abc /usr/sbin/
[root@master01 ~]# tail -f /var/log/messages
Apr 15 02:04:03 master01 falco: 02:04:03.052503851: Error File below /etc opened for writing (user=root user_loginuid=0 command=touch abc /usr/sbin/ parent=bash pcmdline=bash file=/etc/falco/abc program=touch gparent=sshd ggparent=sshd gggparent=systemd container_id=host image=<NA>)
Example alert rule (falco_rules.local.yaml):
[root@master01 ~]# vim falco_rules.local.yaml
- rule: The program "sudo" is run in a container
  desc: An event will trigger every time you run sudo in a container
  condition: evt.type = execve and evt.dir=< and container.id != host and proc.name = sudo
  output: "Sudo run in container (user=%user.name %container.info parent=%proc.pname cmdline=%proc.cmdline)"
  priority: ERROR
  tags: [users, container]
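rule: the rule name; must be unique
desc: a description of the rule
condition: the condition expression
output: the output format for events that match the condition
priority: the alert priority
tags: the tag categories for this rule

1. Monitor reads and writes in system binary directories (default rule)
2. Monitor files written under the root directory or /root (default rule)
3. Monitor containers that run an interactive shell (default rule)
4. Monitor untrusted processes spawned in containers (custom rule)
Verify with: tail -f /var/log/messages (by default, alerts are sent to standard output and the system log)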
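Rule for monitoring untrusted processes spawned in containers
Add the rule to falco_rules.local.yaml.
How to read the condition expression:
• spawned_process: a new process is started
• container: the event happens inside a container
• container.image startswith nginx: the container image name starts with nginx
• not proc.name in (nginx): the process name is not in the allowed list (here, only nginx)
Restart Falco to apply the new configuration:
systemctl restart falco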
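The condition breakdown above can be read as a plain boolean predicate. The following Python sketch is only a mental model, not how Falco evaluates rules: it assumes that `spawned_process` expands to a completed `execve` and that `container` expands to `container.id != host`, the same explicit form used in the sudo rule earlier. The function name and the event dictionaries are made up for illustration.

```python
from typing import Mapping

# Mirrors the allow-list in "not proc.name in (nginx)".
ALLOWED_PROCESSES = {"nginx"}


def is_unauthorized_nginx_process(event: Mapping[str, str]) -> bool:
    """Rough Python reading of the rule condition, keyed by Falco field names."""
    # Assumed expansion of the spawned_process macro: a completed execve.
    spawned_process = event.get("evt.type") == "execve" and event.get("evt.dir") == "<"
    # Assumed expansion of the container macro: the event is not from the host.
    in_container = event.get("container.id", "host") != "host"
    image_is_nginx = event.get("container.image", "").startswith("nginx")
    proc_allowed = event.get("proc.name") in ALLOWED_PROCESSES
    return spawned_process and in_container and image_is_nginx and not proc_allowed


# Hypothetical events shaped after the fields used in the rule.
tail_event = {
    "evt.type": "execve",
    "evt.dir": "<",
    "container.id": "801e315438d5",
    "container.image": "nginx",
    "proc.name": "tail",
}
nginx_event = {**tail_event, "proc.name": "nginx"}
host_event = {**tail_event, "container.id": "host"}

for name, evt in [("tail", tail_event), ("nginx", nginx_event), ("host", host_event)]:
    print(name, is_unauthorized_nginx_process(evt))
```

Only the `tail` event should be flagged: the `nginx` process is on the allow-list, and the host event is not inside a container.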
[root@master01 falco]# vim falco_rules.local.yaml
- rule: Unauthorized process on nginx containers
  condition: spawned_process and container and container.image startswith nginx and not proc.name in (nginx)
  desc: test
  output: "Unauthorized process on nginx containers (user=%user.name container_name=%container.name container_id=%container.id image=%container.image.repository shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline terminal=%proc.tty)"
  priority: WARNING
  tags: [container]
[root@master01 falco]# systemctl restart falco
[root@master01 falco]# journalctl -u falco -f
-- Logs begin at Thu 2022-04-14 14:39:00 UTC. --
Apr 15 02:38:05 master01 systemd[1]: Stopped Falco: Container Native Runtime Security.
Apr 15 02:38:05 master01 systemd[1]: Starting Falco: Container Native Runtime Security...
Apr 15 02:38:05 master01 systemd[1]: Started Falco: Container Native Runtime Security.
Apr 15 02:38:05 master01 falco[116163]: Falco version 0.31.1 (driver version b7eb0dd65226a8dc254d228c8d950d07bf3521d2)
Apr 15 02:38:05 master01 falco[116163]: Falco initialized with configuration file /etc/falco/falco.yaml
Apr 15 02:38:05 master01 falco[116163]: Loading rules from file /etc/falco/falco_rules.yaml:
Apr 15 02:38:05 master01 falco[116163]: Loading rules from file /etc/falco/falco_rules.local.yaml:
Apr 15 02:38:06 master01 falco[116163]: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Apr 15 02:38:07 master01 falco[116163]: Starting internal webserver, listening on port 8765
Apr 15 02:38:07 master01 falco[116163]: 02:38:07.523070816: Informational Privileged container started (user=> user_loginuid=0 command=container:8ab8ea9eb331 kube-proxy (id=8ab8ea9eb331) image=registry.aliyuncs.com/google_containers/kube-proxy:v1.22.1)
Starting an nginx container triggers the Falco rule, and running tail inside the container produces an alert whose output contains shell=tail:
[root@master01 falco]# docker run -d nginx
801e315438d5e62c0d02eff444df9b17a9e7d635061e85037df56be9eaa02c4d
[root@master01 falco]# tail -f /var/log/messages
Apr 15 02:39:34 master01 falco: 02:39:34.313610957: Warning Unauthorized process on nginx containers (user=root container_name=cool_heisenberg container_id=801e315438d5 image=nginx shell=md5sum parent=10-listen-on-ip cmdline=md5sum -c - terminal=0)
Apr 15 02:39:34 master01 falco: 02:39:34.319772871: Warning Unauthorized process on nginx containers (user=root container_name=cool_heisenberg container_id=801e315438d5 image=nginx shell=sed parent=10-listen-on-ip cmdline=sed -i -E s,listen 80;,listen 80;\n listen [::]:80;, /etc/nginx/conf.d/default.conf terminal=0)
[root@master01 falco]# docker exec -it 801e315438d5 tail /var/log/nginx/access.log
[root@master01 ~]# tail -f /var/log/messages
Apr 15 02:43:33 master01 falco: 02:43:33.449344235: Warning Unauthorized process on nginx containers (user=root container_name=cool_heisenberg container_id=801e315438d5 image=nginx shell=tail parent=runc cmdline=tail /var/log/nginx/access.log terminal=34816)
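Falco supports five ways of sending alert notifications.
Alert configuration file: /etc/falco/falco.yaml
For example, to write alerts to a specific file in JSON format: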
[root@master01 falco]# vim falco.yaml
json_output: true
···
syslog_output:
  enabled: false
···
file_output:
  enabled: true
  keep_alive: true
  filename: /var/log/falco_events.log
···
stdout_output:
  enabled: false
[root@master01 falco]# systemctl restart falco
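With `json_output` enabled, each alert in the output file is one JSON object per line, which makes the file easy to post-process. The sketch below counts alerts per priority and rule. It assumes each object carries `priority` and `rule` keys; the sample lines are hand-written for illustration and are not real Falco output, and `summarize_alerts` is a hypothetical helper.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_alerts(lines: Iterable[str]) -> Counter:
    """Count alerts per (priority, rule) from JSON-lines input."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            alert = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(alert, dict):
            continue
        counts[(alert.get("priority", "?"), alert.get("rule", "?"))] += 1
    return counts


# Hand-written sample lines (assumed shape), plus one junk line and one blank line.
SAMPLE_LINES = [
    json.dumps({"priority": "Warning", "rule": "Unauthorized process on nginx containers"}),
    json.dumps({"priority": "Warning", "rule": "Unauthorized process on nginx containers"}),
    json.dumps({"priority": "Error", "rule": "Write below etc"}),
    "not json",
    "",
]

for (priority, rule), n in summarize_alerts(SAMPLE_LINES).most_common():
    print(f"{n:3d}  {priority:8s} {rule}")
```

To run it against the real file, pass an open file handle instead of the sample list, for example `summarize_alerts(open("/var/log/falco_events.log"))`.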
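• FalcoSideKick: collects alerts centrally and forwards them to configured outputs; it supports a large number of outputs, such as InfluxDB and Elasticsearch
Project: https://github.com/falcosecurity/falcosidekick
• FalcoSideKick-UI: a web UI that displays alert notifications in one place
Project: https://github.com/falcosecurity/falcosidekick-ui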
[root@master01 falco]# docker run -d \
-p 2801:2801 \
--name falcosidekick \
-e WEBUI_URL=http://10.11.121.118:2802 \
falcosecurity/falcosidekick
[root@master01 falco]# docker run -d \
-p 2802:2802 \
--name falcosidekick-ui \
falcosecurity/falcosidekick-ui
[root@master01 falco]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9663f92ade3a falcosecurity/falcosidekick-ui "./falcosidekick-ui" 6 minutes ago Up 6 minutes 0.0.0.0:2802->2802/tcp, :::2802->2802/tcp falcosidekick-ui
80c5d162842f falcosecurity/falcosidekick "./falcosidekick" 7 minutes ago Up 7 minutes 0.0.0.0:2801->2801/tcp, :::2801->2801/tcp falcosidekick
Edit the Falco configuration file to send alerts over HTTP:
[root@master01 falco]# vim falco.yaml
json_output: true
json_include_output_property: true
http_output:
  enabled: true
  url: "http://10.11.121.118:2801/"
[root@master01 falco]# systemctl restart falco
UI URL: http://10.11.121.118:2802/ui/
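In a Kubernetes cluster, the API Server audit log records which users and which services requested operations on cluster resources. You can write rules that control which operations are ignored and which are stored. Audit logs are emitted as JSON, and each entry carries rich metadata such as the request URL, the HTTP method, and the client source. You can feed them to a monitoring service to analyze API traffic and detect trends or potential security issues.
Several kinds of clients may access the API Server.
When a client sends a request to the API Server, the request passes through one or more stages:
Stage | Description |
---|---|
RequestReceived | The audit handler has received the request |
ResponseStarted | The response headers have been sent, but the response body has not |
ResponseComplete | The response body is complete and no more bytes will be sent |
Panic | An internal server error occurred and the request did not complete |
A Kubernetes audit policy file contains a list of rules that describe the logging level: which events are recorded and which are not. The rule levels are listed in the following table:
Level | Description |
---|---|
None | Do not create a log entry for the event |
Metadata | Create a log entry with metadata, but without the request body or response body |
Request | Create a log entry with metadata and the request body, but without the response body |
RequestResponse | Create a log entry with metadata, the request body, and the response body |
Reference: https://kubernetes.io/zh/docs/tasks/debug-application-cluster/audit/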
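The way a policy assigns one of these levels can be sketched in Python. Kubernetes compares each request against the rules in order, and the first matching rule sets the level. The matcher below is a deliberate simplification: it looks only at `users`, `verbs`, and `resources`, ignores namespaces, user groups, non-resource URLs, subresources, and wildcards, and assumes that a request matching no rule is not logged. The rule list and the function names are illustrative.

```python
from typing import List


def rule_matches(rule: dict, user: str, verb: str, group: str, resource: str) -> bool:
    """Simplified match on users, verbs, and resources only."""
    users = rule.get("users")
    if users and user not in users:
        return False
    verbs = rule.get("verbs")
    if verbs and verb not in verbs:
        return False
    entries = rule.get("resources")
    if entries:
        # An entry with no "resources" list covers every resource in its group.
        return any(
            entry.get("group", "") == group
            and (not entry.get("resources") or resource in entry["resources"])
            for entry in entries
        )
    return True  # a rule with no selectors matches everything


def audit_level(rules: List[dict], user: str, verb: str, group: str, resource: str) -> str:
    """Return the level of the first matching rule."""
    for rule in rules:
        if rule_matches(rule, user, verb, group, resource):
            return rule["level"]
    return "None"  # assumption: unmatched requests are not logged


# Illustrative rules: silence kube-proxy, log pods and deployments, drop the rest.
RULES = [
    {"level": "None", "users": ["system:kube-proxy"]},
    {
        "level": "Metadata",
        "resources": [
            {"group": "", "resources": ["pods"]},
            {"group": "apps", "resources": ["deployments"]},
        ],
    },
    {"level": "None"},
]

print(audit_level(RULES, "kubernetes-admin", "create", "", "pods"))
print(audit_level(RULES, "system:kube-proxy", "watch", "", "pods"))
print(audit_level(RULES, "kubernetes-admin", "get", "", "secrets"))
```

Because the first match wins, rule order matters: a broad catch-all rule placed first would hide every more specific rule after it.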
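Audit logs can be written to a local file or sent to a webhook (an external HTTP API).
Configuration steps:
Enable audit logging.
Note: use hostPath volumes to mount the policy file and the log file from the host into the container.
Create an audit directory under /etc/kubernetes/. The policy file is audit-policy.yaml, and the kube-apiserver Pod needs to mount it.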
[root@master01 ~]# mkdir /etc/kubernetes/audit && cd /etc/kubernetes/audit
[root@master01 audit]# cat audit-policy.yaml
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  - level: RequestResponse
    resources:
    - group: ""
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    namespaces: ["kube-system"]
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions"
  - level: Metadata
    omitStages:
      - "RequestReceived"
Add the audit flags and volumes to the kube-apiserver static Pod manifest:
[root@master01 ~]# vi /etc/kubernetes/manifests/kube-apiserver.yaml
…
    - --audit-policy-file=/etc/kubernetes/audit/audit-policy.yaml
    - --audit-log-path=/var/log/k8s_audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
...
    volumeMounts:
    - mountPath: /etc/kubernetes/audit/audit-policy.yaml
      name: audit
    - mountPath: /var/log/k8s_audit.log
      name: audit-log
  volumes:
  - name: audit
    hostPath:
      path: /etc/kubernetes/audit/audit-policy.yaml
      type: File
  - name: audit-log
    hostPath:
      path: /var/log/k8s_audit.log
      type: FileOrCreate
[root@master01 ~]# systemctl restart kubelet
Test it by viewing the current resource operation log:
[root@master01 ~]# cat /var/log/k8s_audit.log
···
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"a2fb8bdd-1289-4045-b0f4-ccbafe4416c8","stage":"ResponseComplete","requestURI":"/apis/discovery.k8s.io/v1/namespaces/default/endpointslices/kubernetes","verb":"get","user":{"username":"system:apiserver","uid":"bb28e427-363f-4841-9925-7ae0f7af6c1d","groups":["system:masters"]},"sourceIPs":["::1"],"userAgent":"kube-apiserver/v1.22.1 (linux/amd64) kubernetes/632ed30","objectRef":{"resource":"endpointslices","namespace":"default","name":"kubernetes","apiGroup":"discovery.k8s.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2022-04-14T16:13:56.054960Z","stageTimestamp":"2022-04-14T16:13:56.056415Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"51e01b26-c275-4953-9472-efec1f8b0ca2","stage":"ResponseComplete","requestURI":"/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/master02?timeout=10s","verb":"update","user":{"username":"system:node:master02","groups":["system:nodes","system:authenticated"]},"sourceIPs":["10.11.121.116"],"userAgent":"kubelet/v1.22.1 (linux/amd64) kubernetes/632ed30","objectRef":{"resource":"leases","namespace":"kube-node-lease","name":"master02","uid":"b2b0fc89-4220-48ee-86c9-b08e8c283f94","apiGroup":"coordination.k8s.io","apiVersion":"v1","resourceVersion":"921237"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2022-04-14T16:13:56.316392Z","stageTimestamp":"2022-04-14T16:13:56.324183Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
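The following policy creates log entries at the Metadata level (metadata only, without the request or response body) for these resources: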
[root@master01 ~]# cat /etc/kubernetes/audit/audit-policy.yaml
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  - level: None
    users:
    - system:apiserver
    - system:kube-controller-manager
    - system:kube-scheduler
    - system:kube-proxy
    - kubelet
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods"]
    - group: "apps"
      resources: ["deployments"]
  - level: None
[root@master01 ~]# ps -ef | grep kubelet
root 72326 72272 10 16:04 ? 00:04:02 kube-apiserver --advertise-address=10.11.121.118
[root@master01 ~]# kill -9 72326
[root@master01 ~]# systemctl restart kubelet
Create a Pod, look up the resulting log entry, and pretty-print the JSON with jq:
[root@master01 ~]# kubectl run web-test-my-pod --image=nginx --image-pull-policy=IfNotPresent
pod/web-test-my-pod created
[root@master01 ~]# cat /var/log/k8s_audit.log | grep web-test-my-pod
[root@master01 ~]# echo '{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"53298e83-3e00-471a-80b0-e7f9fa721be9","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/default/pods?fieldManager=kubectl-run","verb":"create","user":{"username":"kubernetes-admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.11.121.118"],"userAgent":"kubectl/v1.23.5 (linux/amd64) kubernetes/c285e78","objectRef":{"resource":"pods","namespace":"default","name":"web-test-my-pod","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":201},"requestReceivedTimestamp":"2022-04-14T16:51:26.948369Z","stageTimestamp":"2022-04-14T16:51:26.959232Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"","imagepolicywebhook.image-policy.k8s.io/failed-open":"true"}}' | jq .
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "Metadata",
"auditID": "53298e83-3e00-471a-80b0-e7f9fa721be9",
"stage": "ResponseComplete",
"requestURI": "/api/v1/namespaces/default/pods?fieldManager=kubectl-run",
"verb": "create",
"user": {
"username": "kubernetes-admin",
"groups": [
"system:masters",
"system:authenticated"
]
},
"sourceIPs": [
"10.11.121.118"
],
"userAgent": "kubectl/v1.23.5 (linux/amd64) kubernetes/c285e78",
"objectRef": {
"resource": "pods",
"namespace": "default",
"name": "web-test-my-pod",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"code": 201
},
"requestReceivedTimestamp": "2022-04-14T16:51:26.948369Z",
"stageTimestamp": "2022-04-14T16:51:26.959232Z",
"annotations": {
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": "",
"imagepolicywebhook.image-policy.k8s.io/failed-open": "true"
}
}
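The same lookup can be scripted when grep and jq are not enough. The following sketch filters audit events by object name and prints who did what. It relies only on fields visible in the event above (`user.username`, `verb`, `objectRef`, `responseStatus.code`, `stageTimestamp`); the sample events are trimmed copies of the log entries shown earlier, and `describe_events` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def describe_events(lines: Iterable[str], name: str) -> Iterator[str]:
    """Yield a one-line summary for each audit event whose objectRef.name equals name."""
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        ref = event.get("objectRef") or {}
        if ref.get("name") != name:
            continue
        user = (event.get("user") or {}).get("username", "?")
        code = (event.get("responseStatus") or {}).get("code", "?")
        yield (
            f"{event.get('stageTimestamp', '?')} {user} {event.get('verb', '?')} "
            f"{ref.get('resource', '?')}/{name} -> {code}"
        )


# Trimmed copies of two events from the log output above.
SAMPLE_EVENTS = [
    json.dumps({
        "verb": "get",
        "user": {"username": "system:apiserver"},
        "objectRef": {"resource": "endpointslices", "name": "kubernetes"},
        "responseStatus": {"code": 200},
        "stageTimestamp": "2022-04-14T16:13:56.056415Z",
    }),
    json.dumps({
        "verb": "create",
        "user": {"username": "kubernetes-admin"},
        "objectRef": {"resource": "pods", "name": "web-test-my-pod"},
        "responseStatus": {"code": 201},
        "stageTimestamp": "2022-04-14T16:51:26.959232Z",
    }),
]

for summary in describe_events(SAMPLE_EVENTS, "web-test-my-pod"):
    print(summary)
```

Against the real log, pass an open handle on /var/log/k8s_audit.log instead of the sample list.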
Edit and extend the basic policy to log the following:
[root@master01 ~]# cat /etc/kubernetes/audit/audit-policy.yaml
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  - level: RequestResponse
    resources:
    - group: ""
      resources: ["namespaces"]
  - level: Request
    resources:
    - group: ""
      resources: ["persistentvolumes"]
    namespaces: ["front-apps"]
  - level: Metadata
    resources:
    - group: ""
      resources: ["secrets", "configmaps"]
  - level: Metadata
    omitStages:
      - "RequestReceived"