Django Rest Framework 通过 Middleware来生成全局唯一的 log-uuid、记录日志、上报 sentry

欢迎到我的个人博客阅读这篇文章,体验更好哦~

需求

drf 写接口,希望在请求成功后,能记录一些信息,以便于追溯问题

  1. 将当前请求的 requestresponse数据记录到本地 log 文件中
  2. 将关键信息上报至 sentry
  3. 为了能更好的查询日志,希望每个请求到响应的日志中,有一个全局唯一的 log-uuid

明确记录的数据

首先来确定下我们需要哪些数据,以便于下一步的设计。

::: warning 注意

应该明白,我的目的仅仅是记录请求-响应这个过程中的数据,而不包含记录业务相关数据。

业务相关数据,还是要到业务代码中进行记录。

:::

  • request
    • uid,用户 uuid,如果是未登录或者认证失败则返回 None
    • ip,用户的远程 ip 地址
    • path,api的地址,重要
    • view,访问的 view 视图函数
    • method,请求方式
    • body,请求的参数,重要
    • headers,请求的 headers,重要
  • response
    • status,http状态码
    • body,响应的 body,重要
    • headers,响应的 headers,重要

其次,我计划将 log-uuid,放在响应的 headers 里面返回给客户端,即 response.headers 里面。

那么,无论是记录本地 log 还是上报 sentry,以上数据基本能满足,不满足的再自己根据实际情况添加即可。

如何记录

View 视图函数中记录

由于要记录每个请求的request 、response信息,我们首先可以想到 view 视图。

APIView

如果你的视图函数是 CBV,继承自 APIView或者是使用 FBV 的话,可以在每个视图函数中,log 记录request和response,例如:

::: details 展开查看代码

class TestView(APIView):
    # CBV 方式

    def post(self, request, *args, **kwargs):
        log.info(request.data)  # 记录请求信息
        resp = {
            'msg': 'ok',
            'code': 20000,
            'data': None
        }
        log.info(resp)   # 记录响应信息
        sentry_sdk.capture_message('something') # 上报 sentry
        return APIResponse(resp)  # APIResponse是自定义的响应类

:::

很显然,有多少个视图函数,就得都在里面加上这两个 log.info(),以及上报 sentry 的代码,那代码太冗余,肯定是不科学的。

ModelViewSet

如果你的视图函数是继承自 DRF 自身的 ModelViewSet,好像就没有办法在视图中记录了

如果你自定义了一个 ModelViewSet,暂且称为 APIModelViewSet,那么可以在 APIModelViewSet 类中实现日志记录,例如:

::: details 展开查看代码

from rest_framework.exceptions import ValidationError
from rest_framework.viewsets import ModelViewSet

from buggg.core.response import APIResponse
from buggg.utils.utils import now


class APIModelViewSet(ModelViewSet):
    """
    自定义ViewSet,主要实现在写数据到model时保存当前操作用户,及返回response时使用自定义的APIResponse类、记录日志
    """

    def list(self, request, *args, **kwargs):
        log.info('record request')
        queryset = self.filter_queryset(self.get_queryset())

        page = self.paginate_queryset(queryset)
        if page is not None:
            serializer = self.get_serializer(page, many=True, context={'request': self.request})
            return self.get_paginated_response(serializer.data)

        serializer = self.get_serializer(queryset, many=True, context={'request': self.request})
        log.info('record response')
        return APIResponse(serializer.data)

    def retrieve(self, request, *args, **kwargs):
        log.info('record request')
        instance = self.get_object()
        serializer = self.get_serializer(instance, context={'request': self.request})
        log.info('record response')
        return APIResponse(serializer.data)

    def create(self, request, *args, **kwargs):
        log.info('record request')
        serializer = self.get_serializer(data=request.data, context={'request': self.request})
        if serializer.is_valid():
            model = serializer.Meta.model
            if hasattr(model, 'create_by'):
                serializer.save(create_by=request.user)
            if hasattr(model, 'update_by'):
                serializer.save(update_by=request.user)
            if not hasattr(model, 'create_by') and not hasattr(model, 'update_by'):
                serializer.save()
            headers = self.get_success_headers(serializer.data)
            log.info('record response')
            return APIResponse(serializer.data, msg='创建成功', headers=headers)
        raise ValidationError(serializer.errors)

    def update(self, request, *args, **kwargs):
        log.info('record request')
        partial = kwargs.pop('partial', False)
        instance = self.get_object()
        serializer = self.get_serializer(instance, data=request.data, partial=partial,
                                         context={'request': self.request})
        if serializer.is_valid():
            model = serializer.Meta.model
            if hasattr(model, 'update_by'):
                serializer.save(update_by=request.user)
            else:
                serializer.save()
            log.info('record response')
            return APIResponse(serializer.data, msg='更新成功')
        raise ValidationError(serializer.errors)

    def destroy(self, request, *args, **kwargs):
        log.info('record request')
        instance = self.get_object()
        self.perform_destroy(instance)
        if hasattr(instance, 'update_by'):
            instance.update_by = request.user
        if hasattr(instance, 'update_time'):
            instance.update_time = now()
        instance.save()
        log.info('record response')
        return APIResponse(msg='删除成功')

:::

这种自定义ViewSet的方式,可以在方法处进行日志记录,比在每个视图函数里记录要省很多事,明显好多了。

但是,我们还有一个需求是 希望每个请求到响应的日志中,有一个全局唯一的 log-uuid,并且通过响应 headers 返回给客户端

再继续改造改造 APIResponse,在返回时加一个 header,好像也是能做到的。但是仔细分析下这种方法,并不通用,因为假如自己没有继承自 ModelViewSet,而是继承自其他的视图类,那又要重新去弄了,不合理。

Middleware中记录

我们知道,当 django 注册了多个中间件时,先从上至下执行 process_request方法,再从下至上执行 process_response 方法。

::: tip

不了解 django 中间件的可以去百度搜索一下,很多讲解。

中间件主要有4个方法,本次我们主要接触 process_request 和 process_response

:::

根据我们的需求,需要记录日志,需要上报 sentry,需要生成一个全局的 log-uuid并且由 response.headers 返回给客户端。那拆解下,把记录日志作为一个中间件,把上报 sentry 作为一个中间件,互相隔离开,不在同一个中间件中完成。

假设日志记录的中间件为LoggerMiddleware,上报 sentry 的中间件为SentryMiddleware,可以大概完成代码如下

::: details 展开查看LoggerMiddleware代码

import time

import sentry_sdk
from django.contrib.auth.models import AnonymousUser
from django.utils.deprecation import MiddlewareMixin

from buggg.utils.log import log
from buggg.utils.utils import gen_uuid


class LoggerMiddleware(MiddlewareMixin):

    def process_request(self, request):
        request.body = str(request.body, encoding='utf-8').replace(' ', '').replace('\n',
                                                                                    '').replace('\t',
                                                                                                '')
        if 'HTTP_X_FORWARDED_FOR' in request.META:
            remote_address = request.META['HTTP_X_FORWARDED_FOR']
        else:
            remote_address = request.META['REMOTE_ADDR']
        request.remote_address = remote_address

        # 此处省略一些数据提取步骤

        log.info(request)  # 记录请求数据,注意此处是简写了,实际要对request数据进行一些处理

    def process_response(self, request, response):
        # 获取响应内容

        # 获取响应内容
        if response['content-type'] == 'application/json':
            if getattr(response, 'streaming', False):
                response_body = '<<>>'
            else:
                response_body = str(response.content, encoding='utf-8')
        else:
            response_body = '<<>>'

        request.response_body = response_body

        # 此处省略一些数据提取步骤

        log.info(response)  # 记录响应数据,注意此处是简写了,实际要对response数据进行一些处理
        return response


class SentryMiddleware(MiddlewareMixin):

    def process_request(self, request):

        request.body = str(request.body, encoding='utf-8').replace(' ', '').replace('\n',
                                                                                    '').replace('\t',
                                                                                                '')
        if 'HTTP_X_FORWARDED_FOR' in request.META:
            remote_address = request.META['HTTP_X_FORWARDED_FOR']
        else:
            remote_address = request.META['REMOTE_ADDR']
        request.remote_address = remote_address

        sentry_sdk.add_breadcrumb(
            category='remote_address',
            message=f'{request.remote_address}',
            level='info',
        )

        sentry_sdk.add_breadcrumb(
            category='body',
            message=request.request_body,
            level='info',
        )

    def process_response(self, request, response):

        return response

:::

通过这2个中间件的代码,可以发现,process_request里面存在一些相同的代码,而这些代码的作用就是获取 request 或者 response 信息的。既然这里存在相同的代码,那能不能把这部分再提取下呢。

我们再回忆一下中间件的处理顺序,先从上至下执行 process_request方法,再从下至上执行 process_response 方法,在执行的过程中,request 对象和 response 对象都是会一直传递的,传递的顺序就和执行的方法顺序一致。

假设我们定义这样两个中间件,第一个暂且称为ReqCollectionMiddleware,作用是提取 request 信息保存起来且传递下去;第二个暂且称为ResCollectionMiddleware,作用是提取 response 信息保存起来且传递下去。

ReqCollectionMiddleware是提取request,那么就要在process_request方法里面做文章了,我们可以在这个中间件的process_request里面把我们需要的 request 信息提取出来,保存到 request.META 中,而这个中间件的 process_response 则不需要做任何处理,实现代码如下:

::: details 展开查看代码

class ReqCollectionMiddleware(MiddlewareMixin):

    def process_request(self, request):

        log.info('进入ReqCollectionMiddleware,收集请求数据')

        request.META['REQUEST_BODY'] = json.loads(str(request.body, encoding='utf-8').replace(' ', '').replace('\n',
                                                                                                               '').replace(
            '\t',
            ''))

        if 'HTTP_X_FORWARDED_FOR' in request.META:
            remote_address = request.META['HTTP_X_FORWARDED_FOR']
        else:
            remote_address = request.META['REMOTE_ADDR']
        request.META['IP'] = remote_address

        request.META['LOG_UUID'] = gen_uuid()

    def process_response(self, request, response):

        return response

:::

同时,在第18行,可以看到我传递一个 uuid 给 request.META['LOG_UUID'],这将为后续做准备。

ResCollectionMiddleware是提取response,那么就要在 process_response 方法里面做文章了,我们可以在这个中间件的process_response里面把我们需要的 response 信息提取出来。由于我们在上面已经把需要的数据放进了request.META 中,所以我们索性将 response 的数据也保存到其中。而这个中间件的 process_request 则不需要做任何处理,实现代码如下:

::: details 展开查看代码

class ResCollectionMiddleware(MiddlewareMixin):

    def process_request(self, request):
        pass

    def process_response(self, request, response):

        log.info('进入ResCollectionMiddleware,收集响应数据')

        # 获取请求的 uid,如果是未登录的则为 None
        if not isinstance(request.user, AnonymousUser):
            uid = request.user.uid
        else:
            uid = None
        request.META['USER_UID'] = uid

        # 获取响应内容
        if response['content-type'] == 'application/json':
            if getattr(response, 'streaming', False):
                response_body = '<<>>'
            else:
                response_body = json.loads(str(response.content, encoding='utf-8'))
        else:
            response_body = '<<>>'
        request.META['RESP_BODY'] = response_body

        # 获取请求的 view 视图名称
        try:
            request.META['VIEW'] = request.resolver_match.view_name
        except AttributeError:
            request.META['VIEW'] = None

        request.META['STATUS_CODE'] = response.status_code

        # 设置 headers: X-Log-Id
        response.setdefault('X-Log-Id', request.META['LOG_UUID'])

        return response

:::

同时,在第36行,可以看到我为 response 设置了一个 header,名为X-Log-Id,值为我之前生成的 uuid

至此,通过这2个中间件,我们就完成了 request 和 response 信息提取,以及生成一个 log-uuid,并且还能在 response headers里面返回给客户端,这样有问题就好追溯了。那么接下来,LoggerMiddleware 和 SentryMiddleware 就可以简化一些了。

::: details 展开查看LoggerMiddleware代码

class LoggerMiddleware(MiddlewareMixin):
    """
    中间件,记录日志
    """

    def process_request(self, request):
        pass

    def process_response(self, request, response):
        log.info('进入LoggerMiddleware,写日志')
        log_data = {
            'request': {
                'uid': request.META['USER_UID'],
                "ip": request.META['IP'],
                "method": request.method,
                'path': request.get_full_path(),
                'view': request.META['VIEW'],
                'body': request.META['REQUEST_BODY'],
                'headers': _get_request_headers(request),
            },
            'response': {
                'status': request.META['STATUS_CODE'],
                'body': request.META['RESP_BODY'],
                'headers': _get_response_headers(response),
            },
            'log_uuid': request.META['LOG_UUID']
        }
        log.info(json.dumps(log_data))
        return response

:::

::: details 展开查看SentryMiddleware代码

class SentryMiddleware(MiddlewareMixin):

    def process_request(self, request):
        pass

    def process_response(self, request, response):
        log.info('进入SentryMiddleware,上报请求至Sentry')
        sentry_sdk.add_breadcrumb(
            category='path',
            message=request.path,
            level='info',
        )

        sentry_sdk.add_breadcrumb(
            category='body',
            message=request.META["REQUEST_BODY"],
            level='info',
        )

        sentry_sdk.add_breadcrumb(
            category='request_headers',
            message=_get_request_headers(request),
            level='info',
        )

        sentry_sdk.add_breadcrumb(
            category='response_headers',
            message=_get_response_headers(response),
            level='info',
        )

        sentry_sdk.add_breadcrumb(
            category='view',
            message=request.META['VIEW'],
            level='info',
        )
        sentry_sdk.set_user({"id": request.META['USER_UID']})
        sentry_sdk.set_tag("log-id", request.META["LOG_UUID"])
        sentry_sdk.capture_message(request.META["LOG_UUID"])
        return response

:::

::: tip 提示

由于没有提供直接获取 headers 的方法,所以

_get_response_headers() 是自定义的获取完整 response headers 的方法

_get_request_headers() 是自定义的获取完整 request headers 的方法

代码在后面会给出

:::

再次优化

我发现,可以将ReqCollectionMiddleware和ResCollectionMiddleware合并为一个中间件CollectionMiddleware,这样并不会影响整个流程

class CollectionMiddleware(MiddlewareMixin):

    def process_request(self, request):

        log.info('进入CollectionMiddleware,收集请求数据')

        request.META['REQUEST_BODY'] = json.loads(str(request.body, encoding='utf-8').replace(' ', '').replace('\n',
                                                                                                               '').replace(
            '\t',
            ''))

        if 'HTTP_X_FORWARDED_FOR' in request.META:
            remote_address = request.META['HTTP_X_FORWARDED_FOR']
        else:
            remote_address = request.META['REMOTE_ADDR']
        request.META['IP'] = remote_address

        request.META['LOG_UUID'] = gen_uuid()


    def process_response(self, request, response):

        log.info('进入CollectionMiddleware,收集响应数据')

        # 获取请求的 uid,如果是未登录的则为 None
        if not isinstance(request.user, AnonymousUser):
            uid = request.user.uid
        else:
            uid = None
        request.META['USER_UID'] = uid

        # 获取响应内容
        if response['content-type'] == 'application/json':
            if getattr(response, 'streaming', False):
                response_body = '<<>>'
            else:
                response_body = json.loads(str(response.content, encoding='utf-8'))
        else:
            response_body = '<<>>'
        request.META['RESP_BODY'] = response_body

        # 获取请求的 view 视图名称
        try:
            request.META['VIEW'] = request.resolver_match.view_name
        except AttributeError:
            request.META['VIEW'] = None

        request.META['STATUS_CODE'] = response.status_code

        # 设置 headers: X-Log-Id
        response.setdefault('X-Log-Id', request.META['LOG_UUID'])

        return response

中间件注册顺序

到现在,有4个中间件了

CollectionMiddleware 收集请求、响应数据,分别在 process_request() 和process_response()中执行

LoggerMiddleware写日志,在 process_response()中执行

SentryMiddleware上报请求数据,在 process_response()中执行

前面说过,process_response()是从下到上顺序执行,那作为收集响应数据的CollectionMiddleware,应该排在最后一个

LoggerMiddleware和SentryMiddleware都是在process_response()里面进行记录和上报的,所以应该在CollectionMiddleware收集响应数据后再执行,那么应该在CollectionMiddleware上面。

那么初步的顺序从上往下就是

LoggerMiddleware

SentryMiddleware

CollectionMiddleware

其中 LoggerMiddleware 和 SentryMiddleware 不分先后

考虑到 django 自带有一些中间件,会处理一些数据,而我们的LoggerMiddleware和SentryMiddleware肯定是需要记录最终的数据,那么将LoggerMiddleware和SentryMiddleware放在所有中间件最上面;

同理,CollectionMiddleware则放在最下面;

MIDDLEWARE = [
    'buggg.core.middlewares.LoggerMiddleware',
    'buggg.core.middlewares.SentryMiddleware',
    'corsheaders.middleware.CorsMiddleware',
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    # 'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'buggg.core.middlewares.CollectionMiddleware'
]

演示

我们来看下实际调用的演示

Postman 发送请求

image-20210131123358918

控制台日志输出

image-20210131123434909

这个json 串如下:

::: details 展开查看 json

{
    "request":{
        "uid":"unu4cx915z4wkbnw",
        "ip":"127.0.0.1",
        "method":"POST",
        "path":"/api/user/role",
        "view":"apps.user.views.RoleView",
        "body":{
            "role_name":"super1",
            "role_code":"SUPER1",
            "role_desc":"superadminrole"
        },
        "headers":{
            "authorization":"Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE2MTE5NzUwMTUsImV4cCI6NDc2Nzc1NjYxNSwic3ViIjoiUkFJTkJPVyIsInVpZCI6InVudTRjeDkxNXo0d2tibncifQ.uFM1LOkb01SxQXXRF4q03jAi753Q-BntcEpJ-tTCX5Y",
            "user_agent":"PostmanRuntime/7.26.8",
            "accept":"*/*",
            "postman_token":"d1f825ba-cadb-4b0e-a31f-d49bb2b5e033",
            "host":"0.0.0.0:8000",
            "accept_encoding":"gzip, deflate, br",
            "connection":"keep-alive"
        }
    },
    "response":{
        "status":200,
        "body":{
            "code":20200,
            "msg":"创建成功",
            "data":{
                "id":3,
                "role_name":"super1",
                "role_code":"SUPER1",
                "role_desc":"super admin role"
            },
            "timestamp":1612067614
        },
        "headers":{
            "Content-Type":"application/json",
            "Vary":"Accept, Origin",
            "Allow":"GET, POST, HEAD, OPTIONS",
            "X-Log-Id":"7a6a233e637d11eb9f36acde48001122",
            "X-Frame-Options":"DENY",
            "Content-Length":"147",
            "X-Content-Type-Options":"nosniff",
            "Referrer-Policy":"same-origin"
        }
    },
    "log_uuid":"7a6a233e637d11eb9f36acde48001122"
}

:::

接口响应 headers 数据,有我们自定义的 x-log-id

image-20210131123632415

sentry 页面显示

image-20210131123952528

总结

至此,通过中间件记录日志,上报 sentry 就完成了,这种方式的好处是,各个中间件各司其职,而且可以方便以后增加中间件,不会有影响,也不会有冗余代码,个人觉得是我能想到的稍微好点的处理方式。

当然,以上只是我个人的看法,如果你有更好的解决方案,欢迎讨论。

最终代码

::: details 中间件代码

import json

import sentry_sdk
from django.contrib.auth.models import AnonymousUser
from django.utils.deprecation import MiddlewareMixin

from buggg.utils.log import log
from buggg.utils.utils import gen_uuid


def _get_request_headers(request):
    headers = {}
    for k, v in request.META.items():
        if k.startswith('HTTP_'):
            headers[k[5:].lower()] = v
    return headers


def _get_response_headers(response):
    headers = {}
    headers_tuple = response.items()
    for i in headers_tuple:
        headers[i[0]] = i[1]
    return headers


NOT_SUPPORT_PATH = '/admin'  # 排除 admin 站点,admin 站点不会进入CollectionMiddleware的process_response方法,会导致报错


class CollectionMiddleware(MiddlewareMixin):

    def process_request(self, request):

        # 增加判断,如果请求的 path 是以/admin/开头的,则直接放过,不做任何处理
        if request.path.startswith(NOT_SUPPORT_PATH):
            pass
        else:

            log.info('进入CollectionMiddleware,收集请求数据')
            request.META['REQUEST_BODY'] = json.loads(str(request.body, encoding='utf-8').replace(' ', '').replace('\n',
                                                                                                                   '').replace(
                '\t',
                ''))

            if 'HTTP_X_FORWARDED_FOR' in request.META:
                remote_address = request.META['HTTP_X_FORWARDED_FOR']
            else:
                remote_address = request.META['REMOTE_ADDR']
            request.META['IP'] = remote_address

            request.META['LOG_UUID'] = gen_uuid()

    def process_response(self, request, response):
        # 增加判断,如果请求的 path 是以/admin/开头的,则直接放过,不做任何处理
        if request.path.startswith(NOT_SUPPORT_PATH):
            pass
        else:

            log.info('进入CollectionMiddleware,收集响应数据')

            # 获取请求的 uid,如果是未登录的则为 None
            if not isinstance(request.user, AnonymousUser):
                uid = request.user.uid
            else:
                uid = None
            request.META['USER_UID'] = uid

            # 获取响应内容
            if response['content-type'] == 'application/json':
                if getattr(response, 'streaming', False):
                    response_body = '<<>>'
                else:
                    response_body = json.loads(str(response.content, encoding='utf-8'))
            else:
                response_body = '<<>>'
            request.META['RESP_BODY'] = response_body

            # 获取请求的 view 视图名称
            try:
                request.META['VIEW'] = request.resolver_match.view_name
            except AttributeError:
                request.META['VIEW'] = None

            request.META['STATUS_CODE'] = response.status_code

            # 设置 headers: X-Log-Id
            response.setdefault('X-Log-Id', request.META['LOG_UUID'])

        return response


class LoggerMiddleware(MiddlewareMixin):
    """
    中间件,记录日志
    """

    def process_request(self, request):
        pass

    def process_response(self, request, response):
        # 增加判断,如果请求的 path 是以/admin/开头的,则直接放过,不做任何处理
        if request.path.startswith(NOT_SUPPORT_PATH):
            pass
        else:
            log.info('进入LoggerMiddleware,写日志')
            log_data = {
                'request': {
                    'uid': request.META['USER_UID'],
                    "ip": request.META['IP'],
                    "method": request.method,
                    'path': request.get_full_path(),
                    'view': request.META['VIEW'],
                    'body': request.META['REQUEST_BODY'],
                    'headers': _get_request_headers(request),
                },
                'response': {
                    'status': request.META['STATUS_CODE'],
                    'body': request.META['RESP_BODY'],
                    'headers': _get_response_headers(response),
                },
                'log_uuid': request.META['LOG_UUID']
            }
            log.info(json.dumps(log_data))
        return response


class SentryMiddleware(MiddlewareMixin):
    """
    上报 sentry
    """

    def process_request(self, request):
        pass

    def process_response(self, request, response):
        # 增加判断,如果请求的 path 是以/admin/开头的,则直接放过,不做任何处理
        if request.path.startswith(NOT_SUPPORT_PATH):
            pass
        else:
            log.info('进入SentryMiddleware,上报请求至Sentry')
            sentry_sdk.add_breadcrumb(
                category='path',
                message=request.path,
                level='info',
            )

            sentry_sdk.add_breadcrumb(
                category='body',
                message=request.META["REQUEST_BODY"],
                level='info',
            )

            sentry_sdk.add_breadcrumb(
                category='request_headers',
                message=_get_request_headers(request),
                level='info',
            )

            sentry_sdk.add_breadcrumb(
                category='response_headers',
                message=_get_response_headers(response),
                level='info',
            )

            sentry_sdk.add_breadcrumb(
                category='view',
                message=request.META['VIEW'],
                level='info',
            )
            sentry_sdk.set_user({"id": request.META['USER_UID']})
            sentry_sdk.set_tag("log-id", request.META["LOG_UUID"])
            sentry_sdk.capture_message(request.META["LOG_UUID"])
        return response

:::

个人博客

我的个人博客在这里哦~

你可能感兴趣的:(Django Rest Framework 通过 Middleware来生成全局唯一的 log-uuid、记录日志、上报 sentry)