Optimizing Airflow workflow performance: coping with a surge in the number of DAGs

Problem description

We use Airflow as our workflow engine and are currently facing a problem: because the number of DAGs has surged, the Airflow web UI has become extremely slow, to the point of being unusable, and on the server the gunicorn processes backing the webserver show very high load. Adjusting the configuration did not help; whether running 4 gunicorn workers or 8, the load stayed high and the pages kept freezing. The webserver load urgently needs to be brought down so that the UI becomes lightweight again.
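
For context, the worker tuning mentioned above boils down to the gunicorn settings in the [webserver] section of airflow.cfg; a minimal sketch of what was changed (the worker_class line is just the stock default, shown for completeness):

[webserver]
# gunicorn worker processes serving the UI; both 4 and 8 were tried with no improvement
workers = 8
# stock default worker type, shown only for context
worker_class = sync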

Following the recommendation in the official documentation, setting store_serialized_dags = True means the webserver no longer has to repeatedly parse the DAG files under the dags folder; instead it reads the serialized DAGs persisted in the database, which greatly improves webserver performance.

# Whether to serialise DAGs and persist them in DB.
# If set to True, Webserver reads from DB instead of parsing DAG files
# More details: https://airflow.apache.org/docs/stable/dag-serialization.html
store_serialized_dags = True
compress_serialized_dags = False

After making this change following the docs, the webserver load did indeed drop, but a new problem appeared: none of the DAGs in the Airflow UI could be opened any more, and the webserver's Flask access log showed the following error whenever a DAG's detail page was viewed:
ValueError: DAG 'xxxx' not found in serialized_dag table

Based on the official documentation's explanation of the store_serialized_dags setting, the guess was that these DAGs were failing to be persisted to the database.
This piece of source code, self._add_dag_from_db(dag_id=dag_id), supports that deduction.
Writing the serialized DAGs into the database should be the scheduler's job, so the next step is to look at the scheduler logs.
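
One quick way to confirm that the DAGs really are missing from the database is to query the serialized_dag table directly. A minimal sketch, assuming an Airflow 1.10.x install where create_session lives in airflow.utils.db; the dag_id below is just a placeholder:

# Check whether any DAGs have actually been serialized into the metadata DB.
from airflow.models.serialized_dag import SerializedDagModel
from airflow.utils.db import create_session

with create_session() as session:
    total = session.query(SerializedDagModel).count()
    print("rows in serialized_dag:", total)
    # look for one specific DAG (placeholder id)
    row = (session.query(SerializedDagModel)
                  .filter(SerializedDagModel.dag_id == "xxxx")
                  .one_or_none())
    print("found:", row is not None)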

The scheduler log shows the error below, and it fails in the same way for every DAG it parses:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 449: ordinal not in range(128)

Solution

Looking at the codecs.ascii_decode(input, self.errors)[0] call and walking up the stack, combined with explanations of this error found online, it appears the scheduler runs into Chinese characters while parsing the DAG files, so serialization fails and the DAGs never make it into the database:

  File "/usr/local/lib/python3.6/site-packages/airflow/models/dagcode.py", line 189, in _get_code_from_file
    code = f.read()
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
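
To see why the traceback points at the ASCII codec, here is a minimal sketch (not part of the original investigation) reproducing the failure mode: UTF-8 bytes from a DAG file containing Chinese text cannot be decoded as ASCII.

# A DAG file with a Chinese comment is stored on disk as UTF-8 bytes.
text = "# 每日同步任务\n"          # hypothetical comment inside a DAG file
raw = text.encode("utf-8")

try:
    raw.decode("ascii")            # what an ASCII-default reader effectively does
except UnicodeDecodeError as exc:
    print(exc)                     # 'ascii' codec can't decode byte ... not in range(128)

print(raw.decode("utf-8"))         # decoding explicitly as UTF-8 works fine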

Looking at the open_maybe_zipped source that code = f.read() traces back to, this is where the DAG file is opened and read, so the encoding should be forced at this point:

def open_maybe_zipped(fileloc, mode='r'):
    """
    Opens the given file. If the path contains a folder with a .zip suffix, then
    the folder is treated as a zip archive, opening the file inside the archive.

    :return: a file object, as in `open`, or as in `ZipFile.open`.
    """
    _, archive, filename = ZIP_REGEX.search(fileloc).groups()
    if archive and zipfile.is_zipfile(archive):
        return zipfile.ZipFile(archive, mode=mode).open(filename)
    else:
        return io.open(fileloc, mode=mode)

Change the last line above from io.open(fileloc, mode=mode) to io.open(fileloc, mode=mode, encoding='utf-8'):

def open_maybe_zipped(fileloc, mode='r'):
    """
    Opens the given file. If the path contains a folder with a .zip suffix, then
    the folder is treated as a zip archive, opening the file inside the archive.

    :return: a file object, as in `open`, or as in `ZipFile.open`.
    """
    _, archive, filename = ZIP_REGEX.search(fileloc).groups()
    if archive and zipfile.is_zipfile(archive):
        return zipfile.ZipFile(archive, mode=mode).open(filename)
    else:
        return io.open(fileloc, mode=mode, encoding='utf-8')
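
As a side note on why the one-line patch works: when io.open is called without an explicit encoding, Python falls back to the process locale, which on a server running under a C/POSIX-style locale resolves to ASCII, so any DAG file containing Chinese text blows up on read. Pinning encoding='utf-8' makes the read independent of the locale. A small sketch, using a hypothetical temporary file in place of a real DAG:

import io
import locale
import tempfile

# The default text encoding io.open would use when none is given.
print(locale.getpreferredencoding(False))   # e.g. 'ANSI_X3.4-1968' (ASCII) under LANG=C

# Write a hypothetical DAG-like file containing Chinese text as UTF-8.
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write("# 每日同步任务\n".encode("utf-8"))
    path = tmp.name

# With the encoding pinned, the read no longer depends on the server locale.
with io.open(path, mode="r", encoding="utf-8") as f:
    print(f.read())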

After restarting the webserver and scheduler, DAG parsing and scheduling all work normally again.
The webserver load has also stayed low: any page now loads quickly, and the responsiveness of the Airflow UI is no longer affected by the growing number of DAGs.
