Could not load dynamic library 'libcudart.so.11.0' inside a Docker container

Posted by kd3sttzy on 2022-11-03 in Docker

I built a Docker image with the following Dockerfile:

FROM python:latest
WORKDIR /root/my_dir
COPY requirements.txt ./
RUN apt-get update &&\
        apt-get upgrade -y &&\
        apt-get install -y curl &&\
        apt-get install -y nano &&\
        pip3 install -r requirements.txt

Then I described it as a service in docker-compose.yaml:

version: '2.2'

services:
  my_service:
    image: my_image
    volumes:
      - /root/my_dir:/root/my_dir
      - /usr/local/cuda-11.0:/usr/local/cuda-11.0
    environment:
      - LD_LIBRARY_PATH=/usr/local/cuda-11.0/targets/x86_64-linux/lib
    command: ["python3"]
    stdin_open: true
    tty: true

  cuda:
    image: nvidia/cuda:11.0.3-devel-ubuntu16.04
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 0
              capabilities: [gpu]

Versions:

Docker version 20.10.7, build f0df350
docker-compose version 1.29.0, build 07737305
docker-ce 5:20.10.7~3-0~ubuntu-xenial                  
docker-ce-cli 5:20.10.7~3-0~ubuntu-xenial
docker-ce-rootless-extras 5:20.10.7~3-0~ubuntu-xenial
docker-scan-plugin 0.8.0~ubuntu-xenial
nvidia-docker2 2.11.0-1

On my machine I have installed CUDA 11.0, and I use Python 3.8 with TensorFlow 2.4.0, as described here:

https://www.tensorflow.org/install/source#gpu

I start the containers. Everything appears to go well, but when I attach to the container and try to import tensorflow in Python, it gives me:
Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory;
I also tried setting the same environment variables as outside the container:

LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64
CUDA_HOME=/usr/local/cuda-11.0
PATH=$PATH:/usr/local/cuda-11.0/bin

But it had no effect.
I also tried several CUDA Docker images, without success.

yhxst69z  #1

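The problem is that the two services you defined run as separate containers and cannot communicate with each other, so my_service does not get anything from the cuda service.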
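One way to see this from inside my_service is to check whether any directory listed in LD_LIBRARY_PATH actually contains the library. The following is only a minimal sketch: the helper name dirs_containing is made up for illustration, and it looks at LD_LIBRARY_PATH alone, while the dynamic linker also searches other locations, so an empty result is a hint rather than proof.

```python
import os


def dirs_containing(lib_name, ld_library_path=None):
    """Return the LD_LIBRARY_PATH entries that contain a file named lib_name.

    Hypothetical helper for illustration; it only inspects LD_LIBRARY_PATH,
    not the linker cache or the default system library directories.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        # Empty entries (from leading or trailing separators) are skipped
        # here for simplicity.
        if directory and os.path.exists(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # The library name is taken from the error message in the question.
    print(dirs_containing("libcudart.so.11.0"))
```

If this prints an empty list inside the container, the path set in the compose file does not point at a directory holding libcudart.so.11.0 as seen from that container.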
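The best approach is to build a single image on top of a CUDA base image and install Python and the packages from requirements.txt in it.
Here is an example Dockerfile: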

FROM nvidia/cuda:11.2.0-cudnn8-runtime-ubuntu20.04
WORKDIR /root/my_dir
COPY requirements.txt ./

RUN apt-get update && apt-get upgrade -y &&\
    apt-get install -y python3 python3-pip nano

RUN pip3 install --upgrade pip && pip3 install -r requirements.txt

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

And this is the docker-compose file:

version: '2.2'

services:
  my_service:
    build: .
    volumes:
      - /root/my_dir:/root/my_dir
    command: ["bash"]
    stdin_open: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
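Once the container is running, a quick check can show whether the setup works. The sketch below is not part of the original answer: the function names can_load and visible_gpus are made up for illustration, and it assumes TensorFlow 2.x is installed from requirements.txt (if it is not, the second check simply reports None).

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic linker can open lib_name, else False."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False


def visible_gpus():
    """Return the GPUs TensorFlow reports, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    # The library name is taken from the error message in the question.
    print("libcudart.so.11.0 loadable:", can_load("libcudart.so.11.0"))
    print("GPUs seen by TensorFlow:", visible_gpus())
```

Saved under a name of your choice in /root/my_dir (for example check_gpu.py, a hypothetical file name), it can be run from the bash session inside the container with python3 check_gpu.py.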
