How to pass downloaded models from Azure ParallelRunConfig to the entry script

Posted by avkwfej4 on 2023-08-07 in Other


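I want to run predictions on a DataFrame using an Azure parallel run step. There are two ways to load the model: download it in the ParallelRunConfig script and pass it to the child nodes, or download it on each child node.
I can use the environment_variables parameter of ParallelRunConfig to pass the required objects to the child nodes, but it does not support passing a .pkl file.
The second approach, downloading the model directly on the child nodes, is quite expensive and downloads the same model repeatedly.
How can I pass the .pkl file downloaded in the ParallelRunConfig Python file to the child nodes?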

Azure

Source: https://stackoverflow.com/questions/76715748/how-to-pass-downloaded-models-from-azure-parallelrunconfig-to-entrypoint-script

1 Answer


biswetbf1#

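To accomplish this, you can follow these steps:
1. Register the pickle model in the workspace.
2. Load the model in the init() function of the entry script.
3. Use it for processing in the run(mini_batch) function.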
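Steps 2 and 3 can be sketched as an entry script along these lines. This is only a minimal sketch under a few assumptions: the model name is passed to the script as a `--model_name` argument, the default name `iris_model` is hypothetical, and the pickled object exposes a scikit-learn style `predict` method. The helpers `parse_model_name`, `load_pickle` and `predict_batch` are illustrative names, not part of the Azure ML SDK. `Model.get_model_path` is imported lazily so the script can be imported on a machine without the SDK.

```python
import argparse
import pickle

model = None  # set once per worker process by init()


def parse_model_name(argv=None):
    """Read --model_name from the script arguments, ignoring anything else."""
    parser = argparse.ArgumentParser()
    # "iris_model" is a hypothetical default, used only for illustration.
    parser.add_argument("--model_name", default="iris_model")
    args, _unknown = parser.parse_known_args(argv)
    return args.model_name


def load_pickle(path):
    """Deserialize a pickled model from a local file path."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_batch(loaded_model, mini_batch):
    """Score one mini-batch and return one result per input row."""
    return list(loaded_model.predict(mini_batch))


def init():
    """Runs once per worker process, not once per mini-batch."""
    global model
    # Lazy import: the SDK is only needed on the compute target.
    from azureml.core.model import Model
    model = load_pickle(Model.get_model_path(parse_model_name()))


def run(mini_batch):
    """Called for every mini-batch; reuses the model loaded in init()."""
    return predict_batch(model, mini_batch)
```

Because the model is resolved by its registered name inside init(), nothing has to be serialized into environment_variables; each worker process loads the pickle once and reuses it for every mini-batch it receives.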

With this, I was able to run the job using ParallelRunConfig and a pickle model.

Please refer to the sample notebook, which provides detailed steps for using a pickle model with ParallelRunConfig.

Update:

Here is the ParallelRunConfig code snippet:

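from azureml.pipeline.steps import ParallelRunStep, ParallelRunConfig

# scripts_folder, script_file, predict_env and compute_target are assumed to be defined elsewhere.
parallel_run_config = ParallelRunConfig(
    source_directory=scripts_folder,  # folder containing the entry script
    entry_script=script_file,  # script that defines init() and run(mini_batch)
    mini_batch_size='1KB',  # approximate amount of input data per run() call
    error_threshold=5,  # failures tolerated before the job is aborted
    output_action='append_row',  # results of all run() calls are appended to one file
    append_row_file_name="iris_outputs.txt",
    environment=predict_env,
    compute_target=compute_target,
    node_count=2,  # number of compute nodes used
    run_invocation_timeout=600  # timeout in seconds for each run() call
)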
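The configuration alone does not tell the entry script which model to load. One way to do that, sketched here under assumptions, is to pass the registered model's name through the `arguments` parameter of ParallelRunStep so that every node receives only a short string rather than the pickle itself. The helper names `build_step_arguments` and `build_parallel_step`, the step name `batch-score`, and the `--model_name` flag are hypothetical; the entry script is assumed to parse that flag.

```python
def build_step_arguments(model_name):
    """Arguments forwarded to the entry script on every node."""
    return ["--model_name", model_name]


def build_parallel_step(parallel_run_config, named_input, output, model_name):
    """Wrap an existing ParallelRunConfig in a ParallelRunStep."""
    # Lazy import so this module can be imported without the SDK installed.
    from azureml.pipeline.steps import ParallelRunStep

    return ParallelRunStep(
        name="batch-score",                 # hypothetical step name
        parallel_run_config=parallel_run_config,
        inputs=[named_input],               # e.g. dataset.as_named_input("input_ds")
        output=output,                      # e.g. a PipelineData object
        arguments=build_step_arguments(model_name),
        allow_reuse=False,
    )
```

The returned step can then be placed in a Pipeline and submitted through an Experiment in the usual way.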

Details:

Below is the list of registered models in the workspace.
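For reference, registering a pickle file and listing the registered models could look roughly like the sketch below. It assumes the azureml-core SDK and a Workspace object `ws`; the helper names `register_pickle`, `summarize_models` and `list_registered` are hypothetical, and the path and model name passed in are whatever you choose.

```python
def register_pickle(ws, pickle_path, model_name):
    """Upload a local pickle file and register it under model_name."""
    # Lazy import so this module can be imported without the SDK installed.
    from azureml.core.model import Model
    return Model.register(workspace=ws, model_path=pickle_path, model_name=model_name)


def summarize_models(models):
    """Return (name, version) pairs for a list of Model-like objects."""
    return [(m.name, m.version) for m in models]


def list_registered(ws):
    """List every model registered in the workspace as (name, version)."""
    from azureml.core.model import Model
    return summarize_models(Model.list(ws))
```

With a workspace obtained from, for example, `Workspace.from_config()`, calling `register_pickle(ws, "iris_model.pkl", "iris_model")` followed by `list_registered(ws)` should show the new model and its version (both names here are placeholders).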

Using the model