How to clearly list all connections in Composer/Airflow?

Problem description

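I am trying to create a Composer environment using an infrastructure-as-code approach. To do that, I need to store and retrieve Airflow variables programmatically and keep them under version control.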

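In a previous post, Ed Morton wrote a script that converts the table to JSON, but there is a problem with the way Composer/Airflow outputs the data when I use the following command: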

gcloud composer environments run "$COMPOSER_ENV" --location <location> connections -- --list

Sample output:

╒════════════════════════════════╤═════════════════════════════╤════════════════════════════════╤════════╤════════════════╤══════════════════════╤════════════════════════════════╕
│ Conn Id                        │ Conn Type                   │ Host                           │ Port   │ Is Encrypted   │ Is Extra Encrypted   │ Extra                          │
╞════════════════════════════════╪═════════════════════════════╪════════════════════════════════╪════════╪════════════════╪══════════════════════╪════════════════════════════════╡
│ 'airflow_db'                   │ 'mysql'                     │ 'airflow-sqlp...rvice.default' │ None   │ True           │ False                │ None                           │
├────────────────────────────────┼─────────────────────────────┼────────────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤

As you can see, the problem is that the Host and Extra columns truncate long text with an ellipsis (...), for example 'airflow-sqlp...rvice.default'.

How can I get the full, untruncated information from the output of the above (Composer) utility?

I am using composer-1.12.1-airflow-1.10.9. Unfortunately, the nice feature for exporting connections to JSON via the CLI is only available in the latest versions of Airflow.

Tags: google-cloud-platform, airflow, google-cloud-composer

Solution


I work with Airflow but have never used Composer. From the documentation, however, I understand that gcloud composer environments run executes Airflow CLI subcommands remotely.

The Airflow CLI has a subcommand that opens a DB shell, airflow shell, and it can read input from stdin. So I tried piping in a SQL statement to retrieve the connections, and it worked.

> echo "select * from connection limit 3;" | airflow shell
/usr/local/Caskroom/miniconda/base/envs/airflow-demo/lib/python3.7/site-packages/airflow/configuration.py:761: DeprecationWarning: You have two airflow.cfg files: /Users/arunvelsriram/airflow/airflow.cfg and /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg. Airflow used to look at ~/airflow/airflow.cfg, even when AIRFLOW_HOME was set to a different value. Airflow will now only read /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg, and you should remove the other file
  category=DeprecationWarning,
DB: sqlite:///airflow_home/airflow.db
1|airflow_db|mysql|mysql|airflow|root||||0|0
2|beeline_default|beeline|localhost|default|||10000|{"use_beeline": true, "auth": ""}|0|0
3|bigquery_default|google_cloud_platform||default|||||0|0
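
The pipe-delimited rows above can be turned into structured records with a few lines of Python. This is only a minimal sketch, not part of Airflow: the helper name parse_connection_rows is hypothetical, the column order is assumed to match the column list used in the JSON query later in this answer, and it assumes no field contains a literal | character.

```python
import json

# Column order assumed from the `connection` table as queried in this answer
COLUMNS = [
    "id", "conn_id", "conn_type", "host", "schema", "login",
    "password", "port", "extra", "is_encrypted", "is_extra_encrypted",
]


def parse_connection_rows(raw_output):
    """Hypothetical helper: parse `id|conn_id|...` rows, skipping other lines."""
    records = []
    for line in raw_output.splitlines():
        fields = line.split("|")
        # Keep only lines with the expected field count and a numeric id;
        # this drops the DeprecationWarning text and the "DB: ..." banner.
        if len(fields) != len(COLUMNS) or not fields[0].isdigit():
            continue
        # Every empty field is mapped to None, so NULL and a genuinely
        # empty string cannot be told apart here. All values stay strings.
        records.append(
            {name: (value if value != "" else None)
             for name, value in zip(COLUMNS, fields)}
        )
    return records


if __name__ == "__main__":
    sample = (
        "DB: sqlite:///airflow_home/airflow.db\n"
        "1|airflow_db|mysql|mysql|airflow|root||||0|0\n"
        '2|beeline_default|beeline|localhost|default|||10000|'
        '{"use_beeline": true, "auth": ""}|0|0\n'
    )
    print(json.dumps(parse_connection_rows(sample), indent=2))
```

Because an Extra value could itself contain a | character, asking the database for JSON directly is likely the more robust route.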

We can also extract the result as JSON or CSV; most databases support this. For example, in SQLite:

> echo "select
json_group_array(
        json_object(
        'id', id,
        'conn_id', conn_id,
        'conn_type', conn_type,
        'host', host, 'schema', schema,
        'login', login,
        'password', password,
        'port', port,
        'extra', extra,
        'is_encrypted', is_encrypted,
        'is_extra_encrypted', is_extra_encrypted
    )
) as json_result
from (select * from connection limit 3);" | airflow shell
/usr/local/Caskroom/miniconda/base/envs/airflow-demo/lib/python3.7/site-packages/airflow/configuration.py:761: DeprecationWarning: You have two airflow.cfg files: /Users/arunvelsriram/airflow/airflow.cfg and /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg. Airflow used to look at ~/airflow/airflow.cfg, even when AIRFLOW_HOME was set to a different value. Airflow will now only read /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg, and you should remove the other file
  category=DeprecationWarning,
DB: sqlite:///airflow_home/airflow.db
[{"id":1,"conn_id":"airflow_db","conn_type":"mysql","host":"mysql","schema":"airflow","login":"root","password":null,"port":null,"extra":null,"is_encrypted":0,"is_extra_encrypted":0},{"id":2,"conn_id":"beeline_default","conn_type":"beeline","host":"localhost","schema":"default","login":null,"password":null,"port":10000,"extra":"{\"use_beeline\": true, \"auth\": \"\"}","is_encrypted":0,"is_extra_encrypted":0},{"id":3,"conn_id":"bigquery_default","conn_type":"google_cloud_platform","host":null,"schema":"default","login":null,"password":null,"port":null,"extra":null,"is_encrypted":0,"is_extra_encrypted":0}]
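
Since the goal in the question is to keep this data versioned, the JSON line has to be separated from the warning and the DB: banner that precede it. The following is a rough sketch under stated assumptions: the helper names extract_json_result and to_versionable_text are hypothetical, and the JSON array is assumed to arrive on a single line, as in the sample output above.

```python
import json


def extract_json_result(raw_output):
    """Hypothetical helper: return the parsed JSON array from the shell output."""
    for line in raw_output.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue
        try:
            return json.loads(line)
        except json.JSONDecodeError:
            # A log line may also start with "[", so keep looking
            continue
    raise ValueError("no JSON array found in output")


def to_versionable_text(connections):
    """Serialize with a stable ordering so diffs between commits stay small."""
    ordered = sorted(connections, key=lambda conn: conn["conn_id"])
    return json.dumps(ordered, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    raw = (
        "  category=DeprecationWarning,\n"
        "DB: sqlite:///airflow_home/airflow.db\n"
        '[{"id":2,"conn_id":"beeline_default"},{"id":1,"conn_id":"airflow_db"}]\n'
    )
    print(to_versionable_text(extract_json_result(raw)), end="")
```

Note that the query above selects the password column, so the resulting file may contain secrets and should be handled accordingly before it is committed anywhere.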

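I could not try this on Composer because I do not have a Composer environment. It is just a trick I could think of, since the current version of the Airflow CLI does not offer configurable output.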

