google-cloud-platform - How to cleanly list all connections in Composer/Airflow?
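Problem
I am trying to create a Composer environment using an infrastructure-as-code approach. To do that, I need to store and retrieve Airflow variables programmatically and keep them under version control.
In a previous post, Ed Morton wrote a script that converts the table to JSON, but there is a problem with the way composer/airflow outputs the data when the following command is used: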
gcloud composer environments run "$COMPOSER_ENV" --location <location> connections -- --list
Sample output:
╒════════════════════════════════╤═════════════════════════════╤════════════════════════════════╤════════╤════════════════╤══════════════════════╤════════════════════════════════╕
│ Conn Id │ Conn Type │ Host │ Port │ Is Encrypted │ Is Extra Encrypted │ Extra │
╞════════════════════════════════╪═════════════════════════════╪════════════════════════════════╪════════╪════════════════╪══════════════════════╪════════════════════════════════╡
│ 'airflow_db' │ 'mysql' │ 'airflow-sqlp...rvice.default' │ None │ True │ False │ None │
├────────────────────────────────┼─────────────────────────────┼────────────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤
As you can see, the problem is that the Host and Extra columns shorten long text with an ellipsis (...), as in 'airflow-sqlp...rvice.default' here.
How can I get the full, untruncated information from the above (composer) utility?
I am using composer-1.12.1-airflow-1.10.9. Unfortunately, the handy feature for exporting connections to JSON from the CLI is only available in the latest versions of Airflow.
Solution
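I work with Airflow but have never used Composer. However, according to the documentation, gcloud composer environments run executes Airflow CLI subcommands remotely.
The Airflow CLI has an option to open a DB shell, airflow shell, and it can read input from stdin. So I tried piping in a SQL statement to retrieve the connections, and it worked.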
> echo "select * from connection limit 3;" | airflow shell
/usr/local/Caskroom/miniconda/base/envs/airflow-demo/lib/python3.7/site-packages/airflow/configuration.py:761: DeprecationWarning: You have two airflow.cfg files: /Users/arunvelsriram/airflow/airflow.cfg and /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg. Airflow used to look at ~/airflow/airflow.cfg, even when AIRFLOW_HOME was set to a different value. Airflow will now only read /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg, and you should remove the other file
category=DeprecationWarning,
DB: sqlite:///airflow_home/airflow.db
1|airflow_db|mysql|mysql|airflow|root||||0|0
2|beeline_default|beeline|localhost|default|||10000|{"use_beeline": true, "auth": ""}|0|0
3|bigquery_default|google_cloud_platform||default|||||0|0
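If only this plain pipe-delimited output is available, the rows could be converted to JSON after the fact. The following is a minimal Python sketch, not something tested against Composer. It assumes the column order of the connection table is the one listed in the JSON query in this answer (id, conn_id, ..., is_extra_encrypted) and that no column before extra contains a | character. The names COLUMNS and parse_rows are hypothetical helpers, not part of Airflow.

```python
import json

# Assumed column order of the `connection` table (taken from this answer).
COLUMNS = [
    "id", "conn_id", "conn_type", "host", "schema", "login",
    "password", "port", "extra", "is_encrypted", "is_extra_encrypted",
]


def parse_rows(raw_output):
    """Turn pipe-delimited sqlite rows into dicts, skipping all other lines."""
    rows = []
    for line in raw_output.splitlines():
        head = line.split("|", 8)  # first 8 columns; the remainder stays whole
        if len(head) != 9 or not head[0].isdigit():
            continue  # warning text, the "DB: ..." banner, blank lines
        # Split from the right so a "|" inside `extra` does not break parsing.
        tail = head[8].rsplit("|", 2)
        if len(tail) != 3:
            continue
        values = head[:8] + tail
        # sqlite prints NULL as an empty string; map it back to None.
        rows.append({k: (v if v != "" else None) for k, v in zip(COLUMNS, values)})
    return rows


sample = '''DB: sqlite:///airflow_home/airflow.db
1|airflow_db|mysql|mysql|airflow|root||||0|0
2|beeline_default|beeline|localhost|default|||10000|{"use_beeline": true, "auth": ""}|0|0
3|bigquery_default|google_cloud_platform||default|||||0|0
'''

connections = parse_rows(sample)
print(json.dumps(connections, indent=2))
```

All values stay strings here (for example, port is "10000"), because the plain-text output carries no type information.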
We can also extract the result as json or csv. Most databases support this. For example, in sqlite:
> echo "select
json_group_array(
json_object(
'id', id,
'conn_id', conn_id,
'conn_type', conn_type,
'host', host,
'schema', schema,
'login', login,
'password', password,
'port', port,
'extra', extra,
'is_encrypted', is_encrypted,
'is_extra_encrypted', is_extra_encrypted
)
) as json_result
from (select * from connection limit 3);" | airflow shell
/usr/local/Caskroom/miniconda/base/envs/airflow-demo/lib/python3.7/site-packages/airflow/configuration.py:761: DeprecationWarning: You have two airflow.cfg files: /Users/arunvelsriram/airflow/airflow.cfg and /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg. Airflow used to look at ~/airflow/airflow.cfg, even when AIRFLOW_HOME was set to a different value. Airflow will now only read /Users/arunvelsriram/spikes/airflow/airflow-demo/airflow_home/airflow.cfg, and you should remove the other file
category=DeprecationWarning,
DB: sqlite:///airflow_home/airflow.db
[{"id":1,"conn_id":"airflow_db","conn_type":"mysql","host":"mysql","schema":"airflow","login":"root","password":null,"port":null,"extra":null,"is_encrypted":0,"is_extra_encrypted":0},{"id":2,"conn_id":"beeline_default","conn_type":"beeline","host":"localhost","schema":"default","login":null,"password":null,"port":10000,"extra":"{\"use_beeline\": true, \"auth\": \"\"}","is_encrypted":0,"is_extra_encrypted":0},{"id":3,"conn_id":"bigquery_default","conn_type":"google_cloud_platform","host":null,"schema":"default","login":null,"password":null,"port":null,"extra":null,"is_encrypted":0,"is_extra_encrypted":0}]
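Because the JSON array arrives mixed with warning lines and the DB banner, a script that wants to version the connections first has to pick out the JSON line. The sketch below shows one way this might be done in Python. It assumes the array is printed on a single line, as in the output above. The function names extract_connections and to_versionable_json are hypothetical, and the sample is an abbreviated copy of that output.

```python
import json


def extract_connections(raw_output):
    """Return the parsed JSON array found in the CLI output, ignoring other lines."""
    for line in raw_output.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            return json.loads(line)
    raise ValueError("no JSON array found in output")


def to_versionable_json(connections):
    """Serialize with stable ordering so diffs in version control stay small."""
    ordered = sorted(connections, key=lambda c: c["conn_id"])
    return json.dumps(ordered, indent=2, sort_keys=True)


# Abbreviated sample of the output above (fewer rows and columns).
sample = r'''category=DeprecationWarning,
DB: sqlite:///airflow_home/airflow.db
[{"id":2,"conn_id":"beeline_default","conn_type":"beeline","host":"localhost","port":10000,"extra":"{\"use_beeline\": true, \"auth\": \"\"}"},{"id":1,"conn_id":"airflow_db","conn_type":"mysql","host":"mysql","port":null,"extra":null}]
'''

connections = extract_connections(sample)
print(to_versionable_json(connections))
```

Note that extra is itself a JSON string inside the row, so it needs a second json.loads call if its contents are to be inspected.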
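I could not try this on Composer because I don't have a Composer environment. It is just a workaround I could think of, since the current version of the Airflow CLI has no configurable output.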