Torch detects the GPU but TensorFlow does not

os8fio9y  posted on 2023-08-06 in Other

I have a Docker container running a Flask application, and I use both TensorFlow and PyTorch in it. PyTorch can use the GPU, but TensorFlow cannot.
nvidia-smi output:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                        Off| 00000000:00:04.0 Off |                    0 |
| N/A   60C    P0               29W /  70W|   1146MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

I don't understand why nvidia-smi reports a CUDA version while nvcc does not work, and I cannot install the CUDA toolkit with apt in the python:3.9-slim Docker image.
nvcc --version output:

bash: nvcc: command not found


Output of import tensorflow:

2023-07-01 21:12:51.765379: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-07-01 21:12:51.814111: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-07-01 21:12:51.814886: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-01 21:12:53.284879: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT


torch output:

>>> import torch
>>> torch.cuda.is_available()
True
>>>


Dockerfile

ARG PYTHON_VERSION=3.9
FROM python:${PYTHON_VERSION}-slim as base
ENV PYTHONDONTWRITEBYTECODE=1

ENV PYTHONUNBUFFERED=1

WORKDIR /app

RUN apt-get update && apt-get install -y ffmpeg

RUN --mount=type=cache,target=/root/.cache/pip \
    --mount=type=bind,source=requirements.txt,target=requirements.txt \
    python -m pip install -r requirements.txt
COPY . .
EXPOSE 80
CMD gunicorn 'main:app' --bind=0.0.0.0:80 --timeout=36000000 --workers=1 --threads=8


compose.yaml

services:
  server:
    build:
      context: .
    ports:
      - 80:80
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]


Can you help me solve this problem?

ee7vknir  #1

TensorFlow's install page lists the following requirements:
- NVIDIA® GPU drivers version 450.80.02 or higher
- CUDA® Toolkit 11.8
- cuDNN SDK 8.6.0
- (Optional) TensorRT to improve latency and throughput
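
As a quick check of the first requirement, the header line of the nvidia-smi output in the question can be compared against the minimum driver version. This is only a minimal sketch; the helper names `version_tuple` and `parse_smi_header` are made up for illustration. Note that the "CUDA Version" field in that header is the highest CUDA version the driver supports, not an installed toolkit, which is why nvcc can be missing while nvidia-smi still reports 12.1.

```python
import re

# Header line copied from the nvidia-smi output in the question.
SMI_HEADER = "| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |"

# Minimum driver version quoted from the TensorFlow install page.
MIN_DRIVER = "450.80.02"


def version_tuple(version):
    """Turn '530.41.03' into (530, 41, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def parse_smi_header(line):
    """Return (driver_version, cuda_version) found in the header line."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", line)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", line)
    if driver is None or cuda is None:
        raise ValueError("not an nvidia-smi header line")
    return driver.group(1), cuda.group(1)


driver, cuda = parse_smi_header(SMI_HEADER)
driver_ok = version_tuple(driver) >= version_tuple(MIN_DRIVER)
print(f"driver {driver} (supports CUDA up to {cuda}), meets minimum: {driver_ok}")
```

So the host driver is not the problem here; the missing pieces are the libraries inside the image.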
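Your image contains none of these libraries, because your Dockerfile performs none of the steps described on the TensorFlow install page. You also haven't posted your requirements.txt.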
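
To see which pieces are actually missing inside the container, you could try loading the shared libraries directly. The sketch below uses only the standard library. The sonames are my assumptions based on the versions listed above (CUDA 11.x normally ships `libcudart.so.11.0`, cuDNN 8.x ships `libcudnn.so.8`), and `can_load` and `probe_cuda_libraries` are hypothetical helper names. It only searches the system loader path, so libraries bundled inside pip packages will not show up. That is probably also why torch works: PyTorch's pip wheels typically ship or pull in their own CUDA libraries.

```python
import ctypes

# Sonames are assumptions derived from the versions on the install page.
# libcuda.so.1 is the driver library, normally injected from the host
# by the NVIDIA container runtime.
LIBRARIES = {
    "driver": "libcuda.so.1",
    "CUDA runtime": "libcudart.so.11.0",
    "cuDNN": "libcudnn.so.8",
}


def can_load(soname):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


def probe_cuda_libraries():
    """Map each label in LIBRARIES to whether its library could be loaded."""
    return {label: can_load(soname) for label, soname in LIBRARIES.items()}


report = probe_cuda_libraries()
for label, found in report.items():
    status = "found" if found else "missing"
    print(f"{label:12s} ({LIBRARIES[label]}): {status}")
```

In a python:3.9-slim based image I would expect the driver library to be found and the other two to be missing, but run it to confirm.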
Consider using the TensorFlow (Dockerhub) / TensorFlow (NVIDIA) Docker images directly.
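
Whichever image you end up with, a check that mirrors the `torch.cuda.is_available()` call from the question can confirm whether TensorFlow sees the GPU. A minimal sketch: `tensorflow_gpus` is a hypothetical helper name, and the import is guarded so the script also runs where TensorFlow is not installed.

```python
def tensorflow_gpus():
    """Return GPU device names seen by TensorFlow, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = tensorflow_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif gpus:
    print("TensorFlow sees:", gpus)
else:
    print("TensorFlow is installed but sees no GPU")
```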
