How to use the Airflow scheduler with systemd?

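Problem description

The documentation provides instructions for integrating Airflow with systemd.

What I want is for the scheduler to restart by itself whenever it stops working. I usually start it manually with airflow scheduler -D, but sometimes it stops when I am not available.

After reading the documentation, I am still not sure about the configuration.

The GitHub repository contains the following files: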

airflow
airflow-scheduler.service
airflow.conf

I am running Ubuntu 16.04.

Airflow is installed in:

home/ubuntu/airflow

I have the following path:

etc/systemd

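The documentation says:

Copy (or link) them to /usr/lib/systemd/system

  1. Which files should be copied?

Copy airflow.conf to /etc/tmpfiles.d/

  1. What is tmpfiles.d?

  2. What goes in # AIRFLOW_CONFIG= in the airflow file?

Or, in other words... is there a more "down to earth" guide on how to do this?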

Tags: airflow

Solution


Integrating Airflow with systemd makes watching your daemons easy, because systemd can take care of restarting a daemon on failure. It also lets the Airflow webserver and scheduler start automatically at system startup.

Edit the airflow file from the systemd folder in the Airflow GitHub repository so that the environment variables AIRFLOW_CONFIG, AIRFLOW_HOME and SCHEDULER match your current configuration.
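As a rough illustration of that step, the sketch below writes a filled-in copy of the airflow environment file to a staging directory so it can be reviewed before it is installed. The values are assumptions based on the question (Airflow under /home/ubuntu/airflow, with airflow.cfg in its default place inside AIRFLOW_HOME); STAGING and AIRFLOW_HOME_DIR are hypothetical helper variables, and the scheduler-related variable is left out, so keep whatever the upstream template provides for it.

```shell
#!/bin/sh
# Write a filled-in copy of the "airflow" environment file to a staging
# directory for review. All values below are assumptions for illustration.
STAGING=${STAGING:-$(mktemp -d)}
AIRFLOW_HOME_DIR=${AIRFLOW_HOME_DIR:-/home/ubuntu/airflow}

cat > "$STAGING/airflow" <<EOF
# Location of airflow.cfg (by default it lives in AIRFLOW_HOME)
AIRFLOW_CONFIG=$AIRFLOW_HOME_DIR/airflow.cfg
# Airflow's working directory (DAGs, logs, config)
AIRFLOW_HOME=$AIRFLOW_HOME_DIR
EOF

echo "Wrote $STAGING/airflow; after reviewing it, install it with:"
echo "  sudo install -D -m 0644 $STAGING/airflow /etc/sysconfig/airflow"
```

The install target has to match the EnvironmentFile= line of the service file. /etc/sysconfig does not normally exist on Ubuntu, which is why the suggested command uses install -D to create the missing directory.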

Copy the service files (the files with the .service extension) to /usr/lib/systemd/system on the VM. On Ubuntu, /etc/systemd/system also works and is the usual place for units you add yourself.

Copy the airflow.conf file to /etc/tmpfiles.d/ or /usr/lib/tmpfiles.d/. tmpfiles.d is the systemd mechanism for creating runtime files and directories at boot, so copying airflow.conf ensures that /run/airflow is created with the right owner and permissions (0755 airflow airflow). Check whether /run/airflow exists and is owned by the airflow user and the airflow group; if it does not, create the /run/airflow folder with that ownership and those permissions.
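The check described above could look roughly like the following. check_dir is a hypothetical helper, not part of Airflow or systemd, and it relies on GNU stat as shipped with Ubuntu.

```shell
#!/bin/sh
# Report whether a directory exists with the expected owner, group and mode.
# Usage: check_dir PATH OWNER GROUP MODE
check_dir() {
    dir=$1; owner=$2; group=$3; mode=$4
    [ -d "$dir" ] || { echo "missing"; return 1; }
    # %U = owner name, %G = group name, %a = permissions in octal
    actual=$(stat -c '%U:%G %a' "$dir")
    if [ "$actual" = "$owner:$group $mode" ]; then
        echo "ok"
    else
        echo "mismatch: $actual"
        return 1
    fi
}

# On the real machine one might then run, for example:
#   check_dir /run/airflow airflow airflow 755 \
#     || sudo systemd-tmpfiles --create /etc/tmpfiles.d/airflow.conf
# or create the directory by hand:
#   sudo install -d -o airflow -g airflow -m 0755 /run/airflow
```

Running systemd-tmpfiles --create applies airflow.conf immediately, so a reboot is not needed to get /run/airflow.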

Enable these services by issuing systemctl enable <service> on the command line, as shown below.

sudo systemctl enable airflow-webserver
sudo systemctl enable airflow-scheduler
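Enabling a service only registers it to start at boot. A fuller sequence, sketched here under the assumption that the unit files have just been copied, also reloads systemd and starts the service right away. The run and activate_service helpers are hypothetical; by default the script only prints the commands, and setting DRY_RUN=0 executes them.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

activate_service() {
    svc=$1
    run sudo systemctl daemon-reload           # pick up new or edited unit files
    run sudo systemctl enable "$svc"           # start at boot
    run sudo systemctl start "$svc"            # start now
    run systemctl --no-pager status "$svc"     # confirm it is running
}

activate_service airflow-scheduler
```

If the scheduler does not come up, sudo journalctl -u airflow-scheduler shows the log output that systemd captured for the service.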

The airflow-scheduler.service file should look like this:

[Unit]
Description=Airflow scheduler daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service

[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=airflow
Group=airflow
Type=simple
ExecStart=/bin/airflow scheduler
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
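Restart=always together with RestartSec=5s is what gives the automatic restart asked for in the question: systemd starts the scheduler again five seconds after it exits. The unit above assumes an airflow user and an executable at /bin/airflow, which may not match a setup like the one in the question. One possible way to adapt it without editing the file is a drop-in override, sketched below. The user name ubuntu and the variables STAGING, RUN_AS and AIRFLOW_BIN are assumptions for illustration.

```shell
#!/bin/sh
# Generate a drop-in override for airflow-scheduler.service in a staging
# directory. The user name and paths are assumptions; adjust before use.
STAGING=${STAGING:-$(mktemp -d)}
RUN_AS=${RUN_AS:-ubuntu}
# Use the airflow executable found on PATH, else a placeholder to edit by hand.
AIRFLOW_BIN=${AIRFLOW_BIN:-$(command -v airflow || echo /path/to/airflow)}

cat > "$STAGING/override.conf" <<EOF
[Service]
User=$RUN_AS
Group=$RUN_AS
# An empty ExecStart= clears the command inherited from the main unit file.
ExecStart=
ExecStart=$AIRFLOW_BIN scheduler
EOF

cat "$STAGING/override.conf"
```

After review, the file could be installed with sudo install -D -m 0644 "$STAGING/override.conf" /etc/systemd/system/airflow-scheduler.service.d/override.conf, followed by sudo systemctl daemon-reload and sudo systemctl restart airflow-scheduler.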
