MLflow project throws a JSONDecodeError at run time

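Problem description

I am trying to run an MLflow project with the MLflow CLI, but following the tutorial leads to an error. For every project I try to run from the CLI, I get the following error: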

Traceback (most recent call last):
  File "/home/rbc/.local/bin/mlflow", line 11, in <module>
    sys.exit(cli())
  File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rbc/.local/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/cli.py", line 139, in run
    run_id=run_id,
  File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 230, in run
    storage_dir=storage_dir, block=block, run_id=run_id)
  File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 88, in _run
    active_run = _create_run(uri, experiment_id, work_dir, entry_point)
  File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/projects/__init__.py", line 579, in _create_run
    active_run = tracking.MlflowClient().create_run(experiment_id=experiment_id, tags=tags)
  File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/tracking/client.py", line 101, in create_run
    source_version=source_version
  File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/store/rest_store.py", line 156, in create_run
    response_proto = self._call_endpoint(CreateRun, req_body)
  File "/home/rbc/.local/lib/python3.6/site-packages/mlflow/store/rest_store.py", line 66, in _call_endpoint
    js_dict = json.loads(response.text)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Here is an example of the kind of command I use to launch the run; it comes straight from the tutorial:

mlflow run https://github.com/mlflow/mlflow#examples/sklearn_elasticnet_wine -m databricks -c cluster-spec.json --experiment-id 72647065958042 -P alpha=2.0 -P l1_ratio=0.5

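I have traced the error to an empty response body being returned when MLflow tries to start the run. However, I can successfully run MLflow experiments against the Databricks environment I am connecting to, so I am not sure where the problem lies. I am running MLflow 0.9.1 on Ubuntu 18.04.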
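
The last frames of the traceback show `json.loads(response.text)` failing at char 0, which is what happens when the body is empty or is not JSON at all (an HTML page, for instance). A minimal sketch reproducing that message with the standard library only; `parse_response_body` is a hypothetical helper for illustration, not part of MLflow:

```python
import json


def parse_response_body(text):
    """Parse a response body as JSON, or raise a more readable error.

    Hypothetical helper for illustration; MLflow itself calls json.loads directly.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # An empty or HTML body is not valid JSON; include a short preview of it.
        raise ValueError(
            "Response body is not JSON (%s); first 80 chars: %r" % (exc, text[:80])
        ) from exc


# An empty body reproduces the message at the bottom of the traceback.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    message = str(exc)

print(message)
```

Printing a preview of the body, as the helper does, usually makes it clear whether the server answered with nothing or with a non-JSON page.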

Tags: mlflow

Solution


Not sure if you have solved your issue, but here is how I fixed it:

  1. The databricks-cli works with the following config without problems:

    host = https://xxx.databricks.net/?o=<org_id>
    token = dapixxx
    
  2. MLflow, however, is not quite happy with that. Change it to:

    host = https://xxx.databricks.net
    username = token
    password = dapixxx
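
The host change in step 2 amounts to keeping only the scheme and host name and dropping the trailing `/?o=<org_id>` part. A minimal sketch of that cleanup using the standard library; `strip_host_extras` is a hypothetical helper name, not part of MLflow or databricks-cli, and the example URL is a placeholder:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_host_extras(host):
    """Return only scheme://host from a workspace URL (hypothetical helper)."""
    parts = urlsplit(host.strip())
    # Drop the path, the query string (e.g. "?o=<org_id>") and any fragment.
    return urlunsplit((parts.scheme, parts.netloc, "", "", ""))


# Placeholder URL in the same shape as the config in step 1.
cleaned = strip_host_extras("https://xxx.databricks.net/?o=123")
print(cleaned)
```

The function is idempotent, so it can be applied to a host value that is already clean without changing it.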
    
