Monitoring Your Machine(s) With TIG

Monitor Your Own Host Within 15 Minutes

(This story/howto/tutorial assumes you have basic knowledge of linux, the command line and how computers work)


TIG

You may have heard of the “TICK” stack: Telegraf, InfluxDB, Chronograf, Kapacitor. I heavily use Telegraf and InfluxDB, but as visualization frontend, Grafana has become the standard in the environments I work in. Kapacitor can still be used alongside for specific purposes. For this article it’s not needed.

Quite a few years ago — out of curiosity and desire to learn and grow — I started playing with Telegraf and InfluxDB. I got hold of a few older machines (dual cores) on which I installed linux for my two kids. And since I wanted to know more about monitoring, I decided to monitor them (the linux machines ofc!) with Telegraf (a small data collection tool written in golang).

So before we can monitor anything, we need to set up the target database and the frontend to see something.


Set Up InfluxDB And Grafana

Since we want to focus more on gathering data, we’ll do this real quick.


Take the below docker-compose.yml file and launch it on a host of your preference. For playing around with it, I recommend launching it on your local machine. If you have a dedicated or virtual linux box, that will do fine too.

macOS Prerequisites:

  • install homebrew : https://brew.sh/

  • install docker : brew install --cask docker (older Homebrew releases used brew cask install docker)

Linux Prerequisites:

  • install docker : https://docs.docker.com/get-docker/

Windows Prerequisites:

  • install docker : https://docs.docker.com/get-docker/

Docker Compose

Download the following and save it as docker-compose.yml file in a new directory.


# using version 2 because of https://docs.docker.com/compose/compose-file/#resources
version: "2.4"
services:
  # collects host metrics; runs in host network mode, so it reaches InfluxDB via localhost
  telegraf:
    image: drpsychick/telegraf
    restart: always
    environment:
      HOST_PROC: /rootfs/proc
      HOST_SYS: /rootfs/sys
      HOST_ETC: /rootfs/etc
      TEL_AGENT_HOSTNAME: hostname = "myhostname"
      TEL_OUTPUTS_INFLUXDB_0: "[[outputs.influxdb]]"
      TEL_OUTPUTS_INFLUXDB_URLS: urls = ["http://localhost:8086"]
      TEL_INPUTS_KERNEL_0: "[[inputs.kernel]]"
      TEL_INPUTS_MEM_0: "[[inputs.mem]]"
      TEL_INPUTS_SWAP_0: "[[inputs.swap]]"
      TEL_INPUTS_PROCESSES_0: "[[inputs.processes]]"
      TEL_INPUTS_SYSTEM_0: "[[inputs.system]]"
      TEL_INPUTS_DISK_0: "[[inputs.disk]]"
      TEL_INPUTS_DISKIO_0: "[[inputs.diskio]]"
      TEL_INPUTS_NET_0: "[[inputs.net]]"
      TEL_INPUTS_NETSTAT_0: "[[inputs.netstat]]"
    volumes:
      - "/proc:/rootfs/proc:ro"
      - "/sys:/rootfs/sys:ro"
      - "/etc:/rootfs/etc:ro"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    cpu_percent: 1
    mem_limit: 50m
    network_mode: "host"
  # time series database that stores the metrics
  influxdb:
    image: drpsychick/influxdb
    restart: always
    environment:
      IFX_GLOBAL: reporting-disabled = true
    ports:
      - 8086:8086
      - 8089:8089/udp
    volumes:
      - "influxdb:/var/lib/influxdb"
    cpu_percent: 50
    mem_limit: 1g
    networks:
      - monitoring
  # dashboard frontend; reaches the database as http://influxdb:8086
  grafana:
    image: grafana/grafana
    restart: always
    # environment:
    #   GF_INSTALL_PLUGINS: # plugins to install
    ports:
      - 3000:3000
    volumes:
      - "grafana:/var/lib/grafana"
    cpu_percent: 1
    mem_limit: 200m
    networks:
      - monitoring
# named volumes keep the data across restarts
volumes:
  influxdb:
  grafana:
networks:
  monitoring:



What It Does

It downloads and starts an InfluxDB as well as a Grafana docker image. To get some data, it will also start a Telegraf service in docker that collects data from your host and sends it to your own, local InfluxDB instance.

If you want to read more on Docker and why it’s good to learn how to use it:


Run it with the following command in the directory where your docker-compose.yml file is stored — it will show how it is downloading and starting the docker services:


host% docker-compose up
Creating network "XXX_monitoring" with the default driver
Creating network "XXX_influxdb" with the default driver
Creating network "XXX_grafana" with the default driver

You will notice that the command does not return and will continuously print the logs from all three services. Here you can see if any errors pop up.

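If you would rather not scan the logs, a quick check from a second terminal can confirm that both services answer. The sketch below is only an illustration: it assumes the InfluxDB image is a 1.x release (whose /ping endpoint replies with status 204) and uses Grafana's /api/health endpoint. check_status and check_stack are hypothetical helper names, not part of any tool.

```shell
#!/bin/sh
# check_status URL EXPECTED: succeed if an HTTP GET on URL returns status EXPECTED.
check_status() {
  actual=$(curl -s -o /dev/null -w '%{http_code}' "$1")
  [ "$actual" = "$2" ]
}

# Check both services of the stack. Defined only; call check_stack to run it.
check_stack() {
  check_status "http://localhost:8086/ping" 204 && echo "InfluxDB is up"
  check_status "http://localhost:3000/api/health" 200 && echo "Grafana is up"
}
```

Source the snippet and call check_stack once docker-compose up has settled. A service that prints no line is probably still starting or has failed, and the logs should say why.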

Play With It

When it’s up and running, open http://localhost:3000/ and you should see a Grafana login screen. Log in with admin as username and admin as password. You will then be asked to set a new password.

Set it up in 2 simple steps:

  1. Add A Datasource

  2. Create a Dashboard


Add A Datasource

In the left side menu, navigate to Configuration and then Data Sources.

Select the InfluxDB data source and enter the following values:


  • URL: http://influxdb:8086 (this points to your local influxdb docker container)

  • Database: telegraf (this is the default database that telegraf writes to)

  • Click the Save & Test button, it should show a green message.

Congratulations! You have successfully added a data source.

Create Your First Dashboard

In the left menu, click the + and then Dashboard, then click the Add new panel button.

On the panel screen you now create your query to the influxdb database. It’s simple drag and drop as the possible values are fetched from the influxdb.

Click on select measurement (a measurement is what InfluxDB calls a “table”) and you will see cpu as an option.

Now configure the query, so that you see what you want to see.


On the left is an example of a working configuration. It even has a GROUP BY for the cpu cores, so you should see multiple series, one for each cpu core.

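For reference, a panel configured like this boils down to a plain InfluxQL query, which you can also send to the database yourself. The snippet below is a sketch under assumptions: the field usage_idle and the tag cpu are what Telegraf's cpu input normally writes, the database is assumed to be InfluxDB 1.x, and influx_query is a hypothetical helper name.

```shell
#!/bin/sh
# Roughly the query the panel editor builds, written out as InfluxQL.
# "usage_idle" and the "cpu" tag are assumed from Telegraf's cpu input.
CPU_QUERY='SELECT mean("usage_idle") FROM "cpu" WHERE time > now() - 15m GROUP BY time(1m), "cpu"'

# influx_query QUERY: run QUERY against the telegraf database (InfluxDB 1.x HTTP API).
influx_query() {
  curl -sS -G "http://localhost:8086/query" \
    --data-urlencode "db=telegraf" \
    --data-urlencode "q=$1"
}

# Only print the query here; run `influx_query "$CPU_QUERY"` to execute it.
echo "$CPU_QUERY"
```

Running influx_query 'SHOW MEASUREMENTS' with the same helper lists the measurements Telegraf has written so far, which is useful when a panel stays empty.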

You can also import existing dashboards directly from grafana.com, which will save you a lot of time setting up useful dashboards. For example, for a simple host dashboard, this one might suffice: https://grafana.com/grafana/dashboards/10581

Also, the Grafana getting started guide may help you find your way around.

Congratulations!

You now have all 3 components running that are needed to monitor host data.


You can stop them temporarily with docker-compose stop and restart them with docker-compose start. Data is persisted in docker volumes, so your dashboards and login data for Grafana remain, as does the data in InfluxDB.

Monitor More

You can monitor many more basic values and also use plenty of input plugins which are already available with Telegraf. To see how you can configure the Telegraf container through environment variables, check the README.md.

If you reach the limits of what is practically configurable through environment variables, you can create your own telegraf.conf configuration file and mount it into a telegraf container. Just add this mount instruction to the volumes of the Telegraf service in docker-compose.yml:

- "/my/directory/telegraf.conf:/etc/telegraf/telegraf.conf:ro"
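As a starting point, a minimal telegraf.conf could look like the one written by the sketch below. It is an assumption-laden example rather than the image's default file: the hostname, the interval and the selection of inputs are placeholders to adapt, and CONF is a hypothetical variable naming where the file is written.

```shell
#!/bin/sh
# Write a minimal telegraf.conf; set CONF to the path you mount into the container.
CONF="${CONF:-./telegraf.conf}"

cat > "$CONF" <<'EOF'
[agent]
  interval = "10s"
  hostname = "myhostname"

# send everything to the local InfluxDB, into the "telegraf" database
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"

# one series per core plus the total
[[inputs.cpu]]
  percpu = true
  totalcpu = true

[[inputs.mem]]

[[inputs.disk]]

[[inputs.net]]
EOF

echo "wrote $CONF"
```

When the file is used inside the container, the HOST_PROC, HOST_SYS and HOST_ETC variables and the read-only host mounts from the compose file are presumably still needed so the inputs read the host's values rather than the container's.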

To collect stats from multiple hosts, you probably want a designated machine running the InfluxDB 24/7. The same machine could serve as the Grafana host, so you can access it from anywhere you want. In my case it is a server on the internet. But your local NAS can serve that purpose as well, if you have one that can run docker containers.

I’m not going into detail here on how to set that up, but below I will mention how to install and run Telegraf on different machines.


Monitor linux hosts

On Linux I prefer to run it in a docker container, simply because I don’t need to install much on the machine then.


  • Install: docker create [..options..] --name telegraf drpsychick/telegraf

  • Configuration: through environment variables, either from a file (--env-file) or directly (--env)

  • Run service: docker start telegraf

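To make the placeholder options concrete, here is one way the steps could be spelled out. Treat it as a sketch: the hostname kids-pc-1 and the server influx.example.com are made-up values, the options simply mirror the Telegraf service in the compose file, and create_telegraf is a hypothetical helper name.

```shell
#!/bin/sh
# Settings for the container, one variable per line (docker --env-file format).
# Values are taken literally, so the inner quotes stay part of the value.
cat > telegraf.env <<'EOF'
HOST_PROC=/rootfs/proc
HOST_SYS=/rootfs/sys
HOST_ETC=/rootfs/etc
TEL_AGENT_HOSTNAME=hostname = "kids-pc-1"
TEL_OUTPUTS_INFLUXDB_0=[[outputs.influxdb]]
TEL_OUTPUTS_INFLUXDB_URLS=urls = ["http://influx.example.com:8086"]
TEL_INPUTS_MEM_0=[[inputs.mem]]
TEL_INPUTS_DISK_0=[[inputs.disk]]
EOF

# Create (but do not start) the container. Defined only; call create_telegraf to run it.
create_telegraf() {
  docker create --name telegraf --restart always --network host \
    --env-file telegraf.env \
    -v /proc:/rootfs/proc:ro -v /sys:/rootfs/sys:ro -v /etc:/rootfs/etc:ro \
    drpsychick/telegraf
}
```

After create_telegraf has succeeded once, docker start telegraf brings the collector up, and the --restart always policy should keep it running across reboots.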

Of course you can use other images or the official telegraf image from InfluxData.

Monitor macOS hosts

  • Install: brew install telegraf

  • Configuration: /usr/local/etc/telegraf.conf

  • Run service: brew services restart telegraf

More macOS monitoring here


Monitor Windows hosts

  • Install: Download the release and install it to C:\Program Files\telegraf

  • Configuration: C:\Program Files\telegraf\telegraf.conf (hint: you probably want to configure the [[inputs.win_perf_counters]] input)

  • Run service: run telegraf.exe --service install once, then net start telegraf

Sometimes on Windows, telegraf no longer collects data. To fix that you have to run lodctr /r as Administrator to refresh the performance monitors.

Undo Everything

One of the nice things about docker-compose: one simple command and everything you have done (downloaded services, volumes created, data gathered …) is gone:

docker-compose down -v

What’s In It For Me?

To conclude, you may wonder why I do this. What is the benefit? Is it worth all the effort?

It depends. First of all, you have to decide that for yourself. I used it to learn more on monitoring, docker, influxdb, grafana and telegraf. I’m sure that’s not a good enough reason for everyone else to use it. Over time it has also revealed a few other benefits to me:

  1. Troubleshooting: I can quickly check the stats of my machines in the network, to see where or what is wrong. It saves time.

  2. Disk alerts: I’ve set up alerts in grafana that inform me when a disk becomes full. You can easily set up your own alerts in grafana.

  3. Monitoring my internet connection: This was probably the biggest benefit for me. We had a crappy DSL line and I had graphs of how much of the bandwidth was used and so forth. Now we have fiber and I don’t have to worry about the bandwidth anymore, but I still like the graphs.

  4. Software metrics: If you write or deploy software, you probably want some application metrics. For example, I send an annotation to my InfluxDB whenever I run a deployment to make it visible in graphs.

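The deployment annotation from the last point can be as small as one HTTP request. The following is a minimal sketch, not the exact script I use: the measurement name deployments, the tag app and the field version are invented for illustration, the helper names are hypothetical, and it assumes an InfluxDB 1.x write endpoint plus values without spaces or quotes (those would need escaping in line protocol).

```shell
#!/bin/sh
# deploy_event_line APP VERSION UNIX_SECONDS: build one line-protocol record.
deploy_event_line() {
  printf 'deployments,app=%s version="%s" %s\n' "$1" "$2" "$3"
}

# send_deploy_event APP VERSION: write the record with the current time.
# Defined only; it needs a reachable InfluxDB (INFLUX_URL defaults to localhost).
send_deploy_event() {
  deploy_event_line "$1" "$2" "$(date +%s)" |
    curl -sS -XPOST "${INFLUX_URL:-http://localhost:8086}/write?db=telegraf&precision=s" \
      --data-binary @-
}

# Show what would be sent for an example deployment.
deploy_event_line myapp 1.4.2 1600000000
```

Calling send_deploy_event myapp 1.4.2 at the end of a deployment script stores the event. A Grafana annotation query on that measurement can then draw a marker on the graphs at each deployment.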

Example Graphs

Stats from a MacBook Pro
Stats from a Windows 10 machine during and after gaming

Original article: https://medium.com/dev-genius/monitoring-your-machine-s-with-tig-a9ef39cd0eec

