Beyla source code: trace context propagation for Go programs

Beyla can collect trace information from applications automatically by using eBPF.

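For Go programs, Beyla also supports trace context propagation, that is, passing the trace context between microservices. This links the calls between services into a single chain and achieves the same result as conventional intrusive (in-code) tracing.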

This article uses Go's net/http as an example to explain how Beyla implements trace context propagation.

1. Overview

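For trace context propagation, Beyla hooks outgoing HTTP calls to external services and adds a traceparent field to the HTTP header.

The traceparent field in the HTTP header is what carries the trace context from one service to the next.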
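To make the header value concrete, here is a minimal Go sketch of how a traceparent value is laid out according to the W3C Trace Context format (version, trace ID, parent span ID, flags). The function name formatTraceparent is hypothetical and is not part of Beyla.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// formatTraceparent builds a W3C traceparent value:
// version "00", a 16-byte trace ID, an 8-byte parent span ID and 1 byte of flags,
// all hex-encoded and joined with "-".
// The name is hypothetical; it only illustrates the layout.
func formatTraceparent(traceID [16]byte, spanID [8]byte, flags byte) string {
	return fmt.Sprintf("00-%s-%s-%02x",
		hex.EncodeToString(traceID[:]),
		hex.EncodeToString(spanID[:]),
		flags)
}

func main() {
	traceID := [16]byte{0x4b, 0xf9, 0x2f, 0x35, 0x77, 0xb3, 0x4d, 0xa6,
		0xa3, 0xce, 0x92, 0x9d, 0x0e, 0x0e, 0x47, 0x36}
	spanID := [8]byte{0x00, 0xf0, 0x67, 0xaa, 0x0b, 0xa9, 0x02, 0xb7}

	tp := formatTraceparent(traceID, spanID, 0x01)
	// The value always has a fixed length: 2 + 1 + 32 + 1 + 16 + 1 + 2 = 55 characters.
	fmt.Println(tp, len(tp))
}
```

The fixed length of 55 characters is what lets an eBPF program reserve a constant-size buffer for the value.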

In a Go application, an HTTP call to an external service invokes the following functions in order:

  • net/http.(*Transport).roundTrip
  • net/http.Header.writeSubset


When roundTrip is hooked:

  • write key=goroutine_addr, value=trace into the ongoing_http_client_requests map;
  • write key=header_addr, value=goroutine_addr into the header_req_map map;

When header_writeSubset is hooked:

  • look up header_req_map by header_addr to find goroutine_addr;
  • look up ongoing_http_client_requests by goroutine_addr to find the trace information;
  • finally, use the BPF helper bpf_probe_write_user() to write the traceparent from the trace information into the HTTP header;

2. Hooking uprobe/roundTrip

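Processing flow:

  • first, write the goroutine and its trace information into the ongoing_http_client_requests map;
  • then:

    • if the current request header has no traceparent, write key=header_addr, value=goroutine_addr into the header_req_map map;
    • if the current request header already has a traceparent, nothing more is needed, because the HTTP header already carries the trace information;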
// bpf/go_nethttp.c

SEC("uprobe/roundTrip")
int uprobe_roundTrip(struct pt_regs *ctx) {
    roundTripStartHelper(ctx);
    return 0;
}
// bpf/go_nethttp.c

/* HTTP Client. We expect to see HTTP client in both HTTP server and gRPC server calls.*/
static __always_inline void roundTripStartHelper(struct pt_regs *ctx) {
    ....
    // Write the goroutine and its trace information into the ongoing_http_client_requests map
    if (bpf_map_update_elem(&ongoing_http_client_requests, &goroutine_addr, &invocation, BPF_ANY)) {
        bpf_dbg_printk("can't update http client map element");
    }
// Only when header propagation is supported
#ifndef NO_HEADER_PROPAGATION
    if (!existing_tp) { // the request has no traceparent yet
        void *headers_ptr = 0;
        bpf_probe_read(&headers_ptr, sizeof(headers_ptr), (void*)(req + req_header_ptr_pos));
        bpf_dbg_printk("goroutine_addr %lx, req ptr %llx, headers_ptr %llx", goroutine_addr, req, headers_ptr);
        
        if (headers_ptr) {
            bpf_map_update_elem(&header_req_map, &headers_ptr, &goroutine_addr, BPF_ANY);    // write into the header_req_map map
        }
    }
#endif
}

Definition of the header_req_map map:

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __type(key, void *); // key: pointer to the request header map
    __type(value, u64); // the goroutine of the transport request
    __uint(max_entries, MAX_CONCURRENT_REQUESTS);
} header_req_map SEC(".maps");

3. Hooking uprobe/header_writeSubset

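Processing flow:

  • first, look up header_req_map by header_addr to get goroutine_addr;
  • then, look up ongoing_http_client_requests by goroutine_addr to get the trace information;
  • finally, format the trace information as a traceparent value and write it into the HTTP header with bpf_probe_write_user();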
// bpf/go_nethttp.c

#ifndef NO_HEADER_PROPAGATION
// Context propagation through HTTP headers
SEC("uprobe/header_writeSubset")
int uprobe_writeSubset(struct pt_regs *ctx) {
    void *header_addr = GO_PARAM1(ctx);
    void *io_writer_addr = GO_PARAM3(ctx);
    // First, look up header_req_map by header_addr to get goroutine_addr
    u64 *request_goaddr = bpf_map_lookup_elem(&header_req_map, &header_addr);
    if (!request_goaddr) { // no tracked request for this header; the verifier also requires this NULL check
        return 0;
    }
    // Then, look up ongoing_http_client_requests by goroutine_addr to get the trace information
    u64 parent_goaddr = *request_goaddr;
    http_func_invocation_t *func_inv = bpf_map_lookup_elem(&ongoing_http_client_requests, &parent_goaddr);
    ...
    unsigned char buf[TRACEPARENT_LEN];
    make_tp_string(buf, &func_inv->tp); // format the trace context into buf
    ...
    // Finally, use bpf_probe_write_user() to write the contents of buf into the HTTP header
    if (len < (size - TP_MAX_VAL_LENGTH - TP_MAX_KEY_LENGTH - 4)) { // 4 = strlen(":_") + strlen("\r\n")
        char key[TP_MAX_KEY_LENGTH + 2] = "Traceparent: ";
        char end[2] = "\r\n";
        bpf_probe_write_user(buf_ptr + (len & 0x0ffff), key, sizeof(key));
        len += TP_MAX_KEY_LENGTH + 2;
        bpf_probe_write_user(buf_ptr + (len & 0x0ffff), buf, sizeof(buf));
        len += TP_MAX_VAL_LENGTH;
        bpf_probe_write_user(buf_ptr + (len & 0x0ffff), end, sizeof(end));
        len += 2;
        bpf_probe_write_user((void *)(io_writer_addr + io_writer_n_pos), &len, sizeof(len));
    }
    return 0;
}
#else
...
#endif
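The write sequence in the probe (key, value, CRLF, then the updated length) can be sketched in Go as an append into a fixed buffer guarded by a free-space check. The constants below are assumptions inferred from the excerpt and from the traceparent format (a 13-byte "Traceparent: " prefix and a 55-byte value), and injectTraceparent is a hypothetical name. The sketch also uses a plain "enough room" check rather than the exact inequality in the eBPF code.

```go
package main

import "fmt"

const (
	tpKey    = "Traceparent: " // assumed: 11-byte key plus ": "
	tpValLen = 55              // assumed: length of a version-00 traceparent value
	crlf     = "\r\n"
)

// injectTraceparent appends "Traceparent: <value>\r\n" to buf at offset n,
// but only if the whole line fits. It returns the new offset and whether
// anything was written. The name is hypothetical.
func injectTraceparent(buf []byte, n int, value string) (int, bool) {
	need := len(tpKey) + len(value) + len(crlf)
	if len(value) != tpValLen || n < 0 || n > len(buf) || len(buf)-n < need {
		return n, false // leave the buffer untouched
	}
	n += copy(buf[n:], tpKey)
	n += copy(buf[n:], value)
	n += copy(buf[n:], crlf)
	return n, true
}

func main() {
	buf := make([]byte, 4096)
	n := copy(buf, "GET / HTTP/1.1\r\nHost: example.com\r\n")

	n, ok := injectTraceparent(buf, n, "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(ok)
	fmt.Printf("%q\n", buf[:n])
}
```

As in the probe, the offset is updated last, so a consumer that reads buf[:n] never sees a half-written header line.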

For net/http, the bytes end up in the buf field of a bufio.Writer, and the n field is advanced to cover them:

// go/src/bufio/bufio.go

type Writer struct {
    err error
    buf []byte
    n   int
    wr  io.Writer
}
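The role of the n field can be observed from ordinary Go code through the exported Buffered and Available methods, which report how many bytes are pending in buf and how much room is left. The helper name bufferedAfter is hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// bufferedAfter writes s into a fresh bufio.Writer with the given buffer size
// and reports the pending byte count and the remaining space.
// The helper name is hypothetical.
func bufferedAfter(s string, size int) (buffered, available int) {
	var sink bytes.Buffer
	w := bufio.NewWriterSize(&sink, size)
	_, _ = w.WriteString(s) // stays in the internal buffer until Flush
	return w.Buffered(), w.Available()
}

func main() {
	buffered, available := bufferedAfter("Host: example.com\r\n", 4096)
	fmt.Println(buffered, available)
}
```

Because nothing reaches the underlying writer until a flush, bytes placed into buf with a matching increase of n are sent out together with the headers the application wrote itself.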

4. The BPF helper bpf_probe_write_user

Because bpf_probe_write_user() modifies user-space memory, it places some requirements on the kernel:

In order to write the traceparent value in outgoing HTTP/gRPC request headers, Beyla needs to write to the process memory using the bpf_probe_write_user eBPF helper.
Since kernel 5.14 (with fixes backported to the 5.10 series) this helper is protected (and unavailable to BPF programs) if the Linux Kernel is running in integrity lockdown mode. Kernel integrity mode is typically enabled by default if the Kernel has Secure Boot enabled, but it can also be enabled manually.

As for the kernel lockdown configuration:

Beyla will automatically check if it can use the bpf_probe_write_user helper, and enable context propagation only if it's allowed by the kernel configuration. Verify the Linux Kernel lockdown mode by running the following command:

cat /sys/kernel/security/lockdown

If that file exists and the mode is anything other than [none], Beyla will not be able to perform context propagation and distributed tracing will be disabled.
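The lockdown file lists every mode and marks the active one with square brackets, for example "[none] integrity confidentiality". A minimal Go sketch for extracting the active mode follows; activeLockdownMode is a hypothetical helper and is not Beyla's KernelLockdownMode.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// activeLockdownMode returns the bracketed entry from content such as
// "[none] integrity confidentiality", or "" if none is found.
// Hypothetical helper, not Beyla's implementation.
func activeLockdownMode(content string) string {
	start := strings.Index(content, "[")
	end := strings.Index(content, "]")
	if start < 0 || end < start {
		return ""
	}
	return content[start+1 : end]
}

func main() {
	data, err := os.ReadFile("/sys/kernel/security/lockdown")
	if err != nil {
		// The file is absent when the lockdown LSM is not available.
		fmt.Println("cannot read lockdown file:", err)
		return
	}
	fmt.Println("active lockdown mode:", activeLockdownMode(string(data)))
}
```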

In the code:

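  • if the kernel major version is below 5, context propagation is supported;
  • if the kernel is 5.x with a minor version below 10 (earlier than 5.10), context propagation is supported;
  • otherwise:

    • read the kernel lockdown file /sys/kernel/security/lockdown to see whether lockdown is enabled;
    • if lockdown is not enabled (KernelLockdownNone), context propagation is supported; in any other mode it is not;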
// pkg/internal/ebpf/common/common.go

func SupportsContextPropagation(log *slog.Logger) bool {
    kernelMajor, kernelMinor := KernelVersion()
    if kernelMajor < 5 || (kernelMajor == 5 && kernelMinor < 10) {
        log.Debug("Found Linux kernel earlier than 5.10, trace context propagation is supported", "major", kernelMajor, "minor", kernelMinor)
        return true
    }
    // Read /sys/kernel/security/lockdown
    lockdown := KernelLockdownMode()
    // If the mode is none, return true
    if lockdown == KernelLockdownNone {
        log.Debug("Kernel not in lockdown mode, trace context propagation is supported.")
        return true
    }
    return false
}
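The excerpt does not show KernelVersion(). As a rough sketch of the whole decision, the code below parses a release string such as "5.15.0-91-generic" and applies the same rule as a pure function. Both parseRelease and supportsPropagation are hypothetical names, and the release-string format is an assumption about typical uname output.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRelease extracts major and minor from a release string such as
// "5.15.0-91-generic". Hypothetical helper, not Beyla's KernelVersion.
func parseRelease(release string) (major, minor int, err error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected kernel release %q", release)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// supportsPropagation applies the rule described above: kernels before 5.10
// are always accepted, newer ones only when lockdown is "none".
func supportsPropagation(major, minor int, lockdown string) bool {
	if major < 5 || (major == 5 && minor < 10) {
		return true
	}
	return lockdown == "none"
}

func main() {
	major, minor, err := parseRelease("5.15.0-91-generic")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(supportsPropagation(major, minor, "integrity"))
	fmt.Println(supportsPropagation(major, minor, "none"))
}
```

Keeping the decision free of I/O makes it easy to check the boundary cases (4.x, 5.9, 5.10) without needing a particular kernel.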

References

1. https://github.com/grafana/beyla/issues/521
2. https://github.com/grafana/beyla/blob/main/docs/sources/distributed-traces.md
