Jaeger End-to-End Tracing in Practice

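Distributed Tracing

The internet is moving toward microservices: complex modules, tangled layers of calls, countless machines, endless logs, services written in many languages, and all kinds of teams. That makes troubleshooting harder and harder, which is why distributed tracing emerged. This post shares some hands-on experience.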

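Benefits & Model

1. Look up a request's upstream and downstream services in one step
2. Get real-time feedback on how long each hop in the chain takes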
[Figure 1]

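Installing and Starting the Services

First, download the release for your operating system from https://www.jaegertracing.io/download/.
For local testing, just run ./jaeger-all-in-one.
For production, run the components separately.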

Agent

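First, a quick word on what the Agent does: it listens on a UDP port for the spans reported by the application and sends them to the collector in batches. It is deployed independently, which keeps it decoupled from the application. Independent deployment means starting its own process; the services below are started with Elasticsearch (ES) as the storage backend.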

export SPAN_STORAGE_TYPE=elasticsearch
nohup  ./jaeger-agent  --collector.host-port=127.0.0.1:14267 --discovery.min-peers=1 --log-level=debug > agent.log 2>&1 &
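
To make the batching idea concrete, here is a minimal Go sketch of a buffer that collects spans and flushes them in groups. It is not the Jaeger agent's code: Span, Batcher, and batchSizes are names made up for this illustration, and flushing purely by batch size is an assumption.

```go
package main

import "fmt"

// Span is a stand-in for a reported span (hypothetical type for this sketch).
type Span struct {
	Operation string
}

// Batcher buffers spans and hands them to flush once batchSize is reached.
type Batcher struct {
	batchSize int
	buf       []Span
	flush     func([]Span) // stands in for "send this batch to the collector"
}

// Add buffers one span and flushes when the buffer is full.
func (b *Batcher) Add(s Span) {
	b.buf = append(b.buf, s)
	if len(b.buf) >= b.batchSize {
		b.Flush()
	}
}

// Flush sends whatever is buffered and resets the buffer.
func (b *Batcher) Flush() {
	if len(b.buf) == 0 {
		return
	}
	out := make([]Span, len(b.buf))
	copy(out, b.buf)
	b.buf = b.buf[:0]
	b.flush(out)
}

// batchSizes feeds n spans through a Batcher and returns the size of each
// flushed batch, including the final partial one.
func batchSizes(n, batchSize int) []int {
	var sizes []int
	b := &Batcher{batchSize: batchSize, flush: func(batch []Span) {
		sizes = append(sizes, len(batch))
	}}
	for i := 0; i < n; i++ {
		b.Add(Span{Operation: fmt.Sprintf("op-%d", i)})
	}
	b.Flush() // flush the remainder, as a real reporter would on shutdown
	return sizes
}

func main() {
	fmt.Println(batchSizes(7, 3)) // prints [3 3 1]
}
```

A real reporter would typically also flush on a timer so that a quiet service does not hold spans indefinitely.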

Collector

The Collector receives the batches sent by the agent and writes them to the storage we chose. The Collector itself is stateless, so we can run multiple collectors.

export SPAN_STORAGE_TYPE=elasticsearch
nohup ./jaeger-collector --es.server-urls=http://127.0.0.1:9200 --es.index-prefix=push --es.username=* --es.password=*  --log-level=debug > collector.log  2>&1 & 

Note that ES usually has authentication enabled. The username and password must be correct for reporting to succeed.
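
Since wrong credentials make reporting fail, it can help to check them before starting the collector. The sketch below sends one request with HTTP basic auth and reports the status code. It assumes the ES endpoint uses basic auth; checkESAuth, fakeES, demoStatus, and the elastic/secret credentials are hypothetical, and the local test server only stands in for a real ES node.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// checkESAuth sends one GET with HTTP basic auth and returns the status code.
func checkESAuth(baseURL, user, pass string) (int, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL, nil)
	if err != nil {
		return 0, err
	}
	req.SetBasicAuth(user, pass)
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// fakeES starts a local test server that accepts only the given credentials.
// It is a stand-in for a real Elasticsearch endpoint.
func fakeES(user, pass string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		if !ok || u != user || p != pass {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

// demoStatus runs checkESAuth against the fake server with the given credentials.
func demoStatus(tryUser, tryPass string) int {
	srv := fakeES("elastic", "secret")
	defer srv.Close()
	code, err := checkESAuth(srv.URL, tryUser, tryPass)
	if err != nil {
		return -1
	}
	return code
}

func main() {
	fmt.Println(demoStatus("elastic", "secret")) // prints 200
	fmt.Println(demoStatus("elastic", "wrong"))  // prints 401
}
```

Against a real cluster, point checkESAuth at the same URL passed to --es.server-urls; a 401 means the username or password is wrong.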

UI

As the name suggests, now that the data has been stored, this is the page for querying and displaying it, as shown below.
[Figure 2: Jaeger query UI]

export SPAN_STORAGE_TYPE=elasticsearch
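nohup ./jaeger-query --span-storage.type=elasticsearch --es.server-urls=http://127.0.0.1:9200 --es.username=* --es.password=*  --es.index-prefix=online --es.timeout=10s > query.log 2>&1 &
This starts the query service. Note that index-prefix is your ES index prefix, which makes aggregated queries easier. It has to match the prefix the collector writes with, otherwise the query service will not find the data.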

HTTP Registration

Load the global tracer when the service starts.

func init() {
    binaryName := strings.Split(os.Args[0], "/")
    statMetric := binaryName[len(binaryName)-1]
    InitTracing(statMetric, 0, 0)
}

func InitTracing(serviceName string, maxIdleConns, maxIdleConnsPerHost int) {
    c := config.Configuration{
        Sampler: &config.SamplerConfig{Type: jaeger.SamplerTypeRemote}, // SamplingServerURL: "http://localhost:5778/sampling"
        Reporter: &config.ReporterConfig{
            LogSpans:            false,
            BufferFlushInterval: 60 * time.Second,
            LocalAgentHostPort:  "127.0.0.1:6831",
        }}
    if closer != nil {
        closer.Close()
    }
    var err error
    closer, err = c.InitGlobalTracer(serviceName)
    if err != nil {
        log.Error(err)
        panic(err)
    }
    DefaultHTTPClient = &http.Client{Transport: NewHTTPTransport(opentracing.GlobalTracer(), maxIdleConns, maxIdleConnsPerHost)}

}
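
The init function above derives the service name by splitting os.Args[0] on "/". A small alternative sketch, assuming the goal is simply the binary's file name, is filepath.Base, which also handles the platform's path separator. serviceNameFrom is a hypothetical helper, not part of the original code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// serviceNameFrom returns the last path element of argv0, which is the
// binary name when argv0 is the path the program was started with.
func serviceNameFrom(argv0 string) string {
	return filepath.Base(argv0)
}

func main() {
	fmt.Println(serviceNameFrom("/usr/local/bin/push-server")) // prints push-server
	// The name of the running binary; the value depends on how it was started.
	fmt.Println(serviceNameFrom(os.Args[0]))
}
```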

Embed it in the transport and you're done.

func (t *Transport) RoundTrip(req *http.Request) (*http.Response, error) {
    rt := t.RoundTripper
    if rt == nil {
        rt = http.DefaultTransport
    }
    tracer, ok := req.Context().Value(keyTracer).(*RequestTracer)
    if !ok {
        return rt.RoundTrip(req)
    }

    tracer.start(req)

    ext.HTTPMethod.Set(tracer.sp, req.Method)
    ext.HTTPUrl.Set(tracer.sp, req.URL.String())

    carrier := opentracing.HTTPHeadersCarrier(req.Header)
    tracer.sp.Tracer().Inject(tracer.sp.Context(), opentracing.HTTPHeaders, carrier)
    resp, err := rt.RoundTrip(req)

    if err != nil {
        return resp, err
    }
    ext.HTTPStatusCode.Set(tracer.sp, uint16(resp.StatusCode))
    // HEAD responses have no body, so only wrap the body for other methods.
    if req.Method != http.MethodHead {
        resp.Body = closeTracker{resp.Body, tracer.sp}
    }
    return resp, nil
}

type RequestTracer struct {
    tr   opentracing.Tracer
    sp   opentracing.Span
    opts *clientOptions
}

func (h *RequestTracer) start(req *http.Request) opentracing.Span {
    if h.sp != nil {
        return h.sp
    }
    // Start a new client span as a child of any span already in the request context.
    parent := opentracing.SpanFromContext(req.Context())
    var spanctx opentracing.SpanContext
    if parent != nil {
        spanctx = parent.Context()
    }
    operationName := h.opts.operationName
    if operationName == "" {
        operationName = "HTTP Client"
    }
    h.sp = h.tr.StartSpan(operationName, opentracing.ChildOf(spanctx))

    ext.SpanKindRPCClient.Set(h.sp)

    componentName := h.opts.componentName
    if componentName == "" {
        componentName = "auto/prefix"
    }
    ext.Component.Set(h.sp, componentName)

    return h.sp
}
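
The core of the transport above is that trace context travels to the downstream service in HTTP headers. The following standard-library-only sketch shows that mechanism in isolation. It does not use OpenTracing; the X-Demo-Trace-Id header, tracingTransport, and headerSeenByServer are hypothetical names for illustration, whereas the real code relies on Inject with an HTTPHeadersCarrier.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// demoHeader is a hypothetical header name used only in this sketch.
const demoHeader = "X-Demo-Trace-Id"

// tracingTransport wraps another RoundTripper and adds a trace header.
type tracingTransport struct {
	next    http.RoundTripper
	traceID string
}

func (t *tracingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	next := t.next
	if next == nil {
		next = http.DefaultTransport
	}
	// A RoundTripper should not modify the caller's request, so work on a clone.
	out := req.Clone(req.Context())
	out.Header.Set(demoHeader, t.traceID)
	return next.RoundTrip(out)
}

// headerSeenByServer sends one request through tracingTransport and returns
// the header value that the downstream server received.
func headerSeenByServer(traceID string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// The downstream service reads the propagated ID and echoes it back.
		fmt.Fprint(w, r.Header.Get(demoHeader))
	}))
	defer srv.Close()

	client := &http.Client{Transport: &tracingTransport{traceID: traceID}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	got, err := headerSeenByServer("abc123")
	fmt.Println(got, err) // prints abc123 <nil>
}
```

Wrapping the transport rather than each call site means every request made through the shared client is traced without changes to business code.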

Business Instrumentation

How do we report internal business functions or logs to the trace?

var Trace opentracing.Tracer
var Closer io.Closer

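func NewTrace(srv string) {
    // Keep Closer and call Closer.Close() on shutdown so buffered spans are flushed.
    Trace, Closer = InitJaeger(srv)
    // SetGlobalTracer replaces the deprecated InitGlobalTracer.
    opentracing.SetGlobalTracer(Trace)
}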

// PushTrace holds the trace info recorded for a push.
type PushTrace struct {
    Ctx      context.Context
    IsRecord bool // whether to record; true means record
}

func NewPushContext(isRecord bool, span opentracing.Span) (trace PushTrace) {
    if !isRecord {
        trace = PushTrace{IsRecord: isRecord}
    } else {
        trace = PushTrace{
            Ctx:      opentracing.ContextWithSpan(context.Background(), span),
            IsRecord: isRecord,
        }
    }
    return trace
}

// AppendSpan adds a simple child span carrying one event/value log entry.
func AppendSpan(trace PushTrace, operationName string, event, val string) {
    if !trace.IsRecord {
        return
    }
    // The returned context is not needed because the span ends in this function.
    span, _ := opentracing.StartSpanFromContext(trace.Ctx, operationName)
    defer span.Finish()
    span.LogFields(
        opentracinglog.String("event", event),
        opentracinglog.String("value", val),
    )
}

Usage example

span := Trace.StartSpan("***")
span.SetTag("push_id", "test")
trace := NewPushContext(true, span)

// Function 1
AppendSpan(trace, "test1", "android", "nopush")

// Finish the parent span once the work it covers is done.
span.Finish()

As shown in the figure:
[Figure 3]

For details, see the specification Jaeger follows (the OpenTracing semantic conventions): https://github.com/opentracing/specification/blob/master/semantic_conventions.md

Code

https://github.com/xiaowei520/golangx/tree/master/trace

References

https://opentracing.io
