GPU (CUDA) access required when deploying a model on Azure

nwnhqdif  posted on 2023-06-07  in  Other

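I need help deploying a pretrained model. I have written a custom score.py file for the deployment. However, the Docker container created on a CPU instance has no access to a GPU, which breaks prediction with PyTorch or TensorFlow models, because the inputs have to be converted to tensors that are loaded on the GPU. Can you suggest a solution?
My score.py script: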

import something

# original = torch.load

# def load(*args):
#     return torch.load(*args, map_location=torch.device("cpu"),pickle_module=None)

# def init():
#     global model
#     model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "use-case1-model")
#     # "model" is the path of the mlflow artifacts when the model was registered. For automl
#     # models, this is generally "mlflow-model".

#     with mock.patch("torch.load", load):
#         model = mlflow.pyfunc.load_model(model_path)

#     logging.info("Init complete")

def init():
    global model

    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "use-case1-model")

    model = mlflow.pytorch.load_model(model_path, map_location=torch.device('cpu'))
    logging.info("Init complete")

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
            

def run(data):

    json_data = json.loads(data) 

    title = json_data["input_data"]["title"]
    att = json_data["input_data"]["attributes"]
    
    result = {}

    for i in range(len(title)):

        my_dict = {}
        for j in range(len(att)):
            
            attr = att[i][j]

            t, a = nobert4token(tokenizer, title[i].lower(), attr)

            x = X_padding(t)
            y = tag_padding(a)

            tensor_a = torch.tensor(y, dtype=torch.int32)
            tensor_a = torch.unsqueeze(tensor_a, dim=0).to("cuda")

            tensor_t = torch.tensor(x, dtype=torch.int32)
            tensor_t = torch.unsqueeze(tensor_t, dim=0).to("cuda")

            output = model([tensor_t, tensor_a])

            predict_list = output.tolist()[0]
            
            my_dict[attr] = " ".join(words_p)

        result[title[i]] = my_dict

    return result

My invocation script:

ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_result.name,
    deployment_name=green_deployment_uc1.name,
    request_file=os.path.join("./dependencies", "sample.json"),
)

My conda.yaml:

channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip=22.1.2
  - numpy=1.21.2
  - scikit-learn=0.24.2
  - scipy=1.7.1
  - 'pandas>=1.1,<1.2'
  - pytorch=1.10.0
  - pip:
      - 'inference-schema[numpy-support]==1.5.0'
      - xlrd==2.0.1
      - mlflow== 1.26.1
      - azureml-mlflow==1.42.0
      - tqdm==4.63.0
      - pytorch-transformers==1.2.0
      - pytorch-lightning==2.0.2
      - seqeval==1.2.2
      - azureml-inference-server-http==0.8.0
name: model-env

The error I get:

127.0.0.1 - - [29/May/2023:10:03:32 +0000] "GET / HTTP/1.0" 200 7 "-" "kube-probe/1.18"
2023-05-29 10:03:34,291 E [70] azmlinfsrv - Encountered Exception: Traceback (most recent call last):
  File "/azureml-envs/azureml_d587e0800be72e17d773ddca63762cd1/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 130, in invoke_run
    run_output = self._wrapped_user_run(**run_parameters, request_headers=dict(request.headers))
  File "/azureml-envs/azureml_d587e0800be72e17d773ddca63762cd1/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 154, in <lambda>
    self._wrapped_user_run = lambda request_headers, **kwargs: self._user_run(**kwargs)
  File "/var/azureml-app/dependencies/score.py", line 129, in run
    tensor_a = torch.unsqueeze(tensor_a, dim=0).to("cuda")
  File "/azureml-envs/azureml_d587e0800be72e17d773ddca63762cd1/lib/python3.8/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

The above exception was the direct cause of the following exception:

If you are wondering why I use "model = mlflow.pytorch.load_model(model_path, map_location=torch.device('cpu'))",
please refer to this forum thread: https://learn.microsoft.com/en-us/answers/questions/1291498/facing-problem-while-deploying-model-on-azure-ml-a
Documentation: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-mlflow-models-online-endpoints?view=azureml-api-2&tabs=sdk

evrscar2


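To resolve this, change the code so that tensors are placed on the GPU only when one is available and fall back to the CPU otherwise. Add a device variable to the script: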

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Then replace the tensor-creation code in the **run()** function with the following:

# Pass the device variable itself, not the string "device"
tensor_a = torch.tensor(y, dtype=torch.int32)
tensor_a = torch.unsqueeze(tensor_a, dim=0).to(device)

tensor_t = torch.tensor(x, dtype=torch.int32)
tensor_t = torch.unsqueeze(tensor_t, dim=0).to(device)
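The input tensors and the model need to live on the same device, so the model should be moved with the same device variable. The following is a minimal, self-contained sketch of that pattern. `TinyTagger` and `to_batch` are hypothetical stand-ins introduced only for illustration (the real model and the `X_padding`/`tag_padding` helpers are not shown in the question), so treat it as a sketch under those assumptions rather than the poster's actual code.

```python
import torch
from torch import nn

# Use the GPU only when a CUDA driver is actually present; otherwise stay on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyTagger(nn.Module):
    """Hypothetical stand-in that accepts [tokens, tags] like the model in the question."""

    def __init__(self, seq_len=4):
        super().__init__()
        self.linear = nn.Linear(seq_len, seq_len)

    def forward(self, inputs):
        tokens, tags = inputs
        # Cast the integer ids to float so the linear layer can consume them.
        return self.linear((tokens + tags).float())


# Move the model once, at load time, to the same device the inputs will use.
model = TinyTagger().to(device)
model.eval()


def to_batch(ids):
    """Turn a list of ids into a (1, seq_len) tensor on the selected device."""
    return torch.tensor(ids, dtype=torch.int32).unsqueeze(0).to(device)


with torch.no_grad():
    output = model([to_batch([1, 2, 3, 4]), to_batch([0, 1, 0, 1])])

# Same post-processing step as in the question's run() function.
predict_list = output.tolist()[0]
```

On a CPU-only container this runs entirely on the CPU. On a GPU instance the same script picks up CUDA without further changes.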

[Screenshots: reproducing the error; the fix]
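The question loads the model with `map_location=torch.device('cpu')`. The same device variable could be passed there instead, so that the weights land wherever the tensors will go. The sketch below shows the idea with plain `torch.save` and `torch.load` on a small hypothetical `nn.Linear` layer. It assumes that `mlflow.pytorch.load_model` forwards `map_location` to `torch.load`, as the question's own usage suggests; the file name `weights.pt` is made up for the example.

```python
import os
import tempfile

import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the registered model's weights.
source = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    weights_path = os.path.join(tmp_dir, "weights.pt")
    torch.save(source.state_dict(), weights_path)

    # map_location remaps every stored tensor onto the chosen device at load time.
    state_dict = torch.load(weights_path, map_location=device)

# Rebuild the layer, load the remapped weights, and keep it on the same device.
restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)
restored.to(device)
```

In score.py the equivalent change would be to define `device` before `init()` and pass `map_location=device` in place of `torch.device('cpu')`.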
