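I am running an experiment as an Azure ML job using the Python SDK v2, but I cannot access the run logs after the run completes. I am not sure what is going on: whether I am missing some permission or a previous step. It just says "run 'xxxx' not found" when I run the following: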
from mlflow.tracking import MlflowClient
# Use MlFlow to retrieve the job that was just completed
run_id = 'musing_steelpan_xxxx'
finished_mlflow_run = MlflowClient().get_run(run_id)
The run_id does exist, and I am the owner of both the workspace and the cluster.
MlflowException Traceback (most recent call last)
Cell In [5], line 6
3 # Use MlFlow to retrieve the job that was just completed
4 run_id = 'musing_steelpan_hnlbhxf9qy'
----> 6 finished_mlflow_run = MlflowClient().get_run(run_id)
File /miniconda/envs/benchmark/lib/python3.8/site-packages/mlflow/tracking/client.py:150, in MlflowClient.get_run(self, run_id)
112 def get_run(self, run_id: str) -> Run:
113 """
114 Fetch the run from backend store. The resulting :py:class:`Run <mlflow.entities.Run>`
115 contains a collection of run metadata -- :py:class:`RunInfo <mlflow.entities.RunInfo>`,
(...)
148 status: FINISHED
149 """
--> 150 return self._tracking_client.get_run(run_id)
File /miniconda/envs/benchmark/lib/python3.8/site-packages/mlflow/tracking/_tracking_service/client.py:72, in TrackingServiceClient.get_run(self, run_id)
58 """
59 Fetch the run from backend store. The resulting :py:class:`Run <mlflow.entities.Run>`
60 contains a collection of run metadata -- :py:class:`RunInfo <mlflow.entities.RunInfo>`,
(...)
69 raises an exception.
70 """
71 _validate_run_id(run_id)
...
648 )
649 run_info = self._get_run_info_from_dir(run_dir)
650 if run_info.experiment_id != exp_id:
MlflowException: Run 'musing_steelpan_xxxx' not found
2 answers
6rqinv9w #1
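In some cases (for example, jobs inside a pipeline or jobs inside a sweep), the display_name shown at the top of the portal (which the user can change) differs from the job's name (which is immutable and is shown further down in the portal; see the image below). Did you take the name or the display_name from the portal (or are they the same)?
6gpjuf90 #2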
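Here is another idea: you may not be connected to the correct workspace. You can point the MLflow client at a workspace either through the MLFLOW_TRACKING_URI environment variable or by passing the URI as a direct argument. Try going to the Azure Portal and looking at the workspace properties, where the workspace's MLflow tracking URI is listed.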
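The two ways of telling the client which workspace to use might look roughly like this. It is only a sketch, not code from the original answer: `resolve_tracking_uri` and `make_client` are hypothetical helper names, and the URI itself must be copied from your own workspace. For context, the traceback in the question ends in `_get_run_info_from_dir`, which belongs to MLflow's local file store, so the client there was probably not pointed at the workspace at all.

```python
import os


def resolve_tracking_uri(explicit_uri=None):
    """Pick the tracking URI: an explicit argument wins over the environment variable."""
    uri = explicit_uri or os.environ.get("MLFLOW_TRACKING_URI")
    if not uri:
        # Without a URI, MLflow falls back to a local ./mlruns store,
        # where an Azure ML run will never be found.
        raise ValueError(
            "No tracking URI: pass one explicitly or set MLFLOW_TRACKING_URI"
        )
    return uri


def make_client(explicit_uri=None):
    """Build an MlflowClient bound to the resolved URI (hypothetical helper)."""
    # Imported lazily so resolve_tracking_uri can be used without mlflow installed.
    from mlflow.tracking import MlflowClient

    return MlflowClient(tracking_uri=resolve_tracking_uri(explicit_uri))
```

Calling `make_client("<URI copied from the portal>").get_run(run_id)` would then query the workspace instead of the local directory.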
You can then plug it into the code below, which should print 100 runs from your workspace (the first 100, I believe...):
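The answer's own snippet did not survive here, so what follows is only a guess at the kind of code meant, not the original. `list_runs` and `print_runs` are hypothetical helpers. They assume an MLflow version whose client provides `search_experiments` (older releases used `list_experiments`), and that the client was created with the tracking URI from the portal.

```python
def list_runs(client, max_results=100):
    """Return up to max_results runs across all experiments the client can see."""
    experiment_ids = [exp.experiment_id for exp in client.search_experiments()]
    if not experiment_ids:
        return []
    return client.search_runs(
        experiment_ids=experiment_ids, max_results=max_results
    )


def print_runs(runs):
    """Print the run ID and status of each run."""
    for run in runs:
        print(run.info.run_id, run.info.status)


# Example usage (needs mlflow plus access to the workspace):
# from mlflow.tracking import MlflowClient
# client = MlflowClient(tracking_uri="<tracking URI copied from the portal>")
# print_runs(list_runs(client))
```

If the run ID from the question appears in this listing, `get_run` with the same client should find it; if the listing is empty or unfamiliar, the client is likely talking to the wrong store.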
For the code above to work, you need to:
1. (the command for this step is garbled in the source)
az login
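Going back to the first answer's point about name versus display_name, one way to check which of the two you copied is to search the listed runs by display name. This is a sketch under two assumptions that are not confirmed by the thread: that Azure ML exposes the job's display_name as the MLflow `mlflow.runName` tag, and that the MLflow run ID corresponds to the job's immutable name. `find_runs_by_display_name` is a hypothetical helper.

```python
def find_runs_by_display_name(runs, display_name):
    """Return the run IDs whose mlflow.runName tag equals display_name.

    Assumption: the portal's display_name is stored in the mlflow.runName tag.
    """
    return [
        run.info.run_id
        for run in runs
        if run.data.tags.get("mlflow.runName") == display_name
    ]
```

If the string passed to `get_run` only turns up here as a display name, the run ID to use is the one this function returns.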