tensorflow: Creating a Docker container for machine learning

Posted by yjghlzjz on 2023-01-31 in Docker

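I'm no Docker maven, and I'm trying to create a container for a machine learning project. This is mostly for academic purposes, since I'm studying *machine learning*.
I wrote a Dockerfile (and a devcontainer.json file for opening the container in VS Code), and it worked fine until I added the lines that build TensorFlow. I've run into three problems, and I can't tell what I'm missing:

  • I get a warning that the Bazel installation is not a release version
  • When the build reaches ./configure, I can't answer its prompts to configure the build
  • The build stage fails with ERROR: The project you're trying to build requires Bazel 5.3.0 (specified in /docklearning/tensorflow/.bazelversion), but it wasn't found in /usr/bin.

Here is the Dockerfile: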

FROM nvidia/cuda:12.0.0-runtime-ubuntu22.04 as base

ARG USER_UID=1000

#switch to non-interactive frontend
ENV DEBIAN_FRONTEND=noninteractive

WORKDIR /docklearning
ADD . /docklearning

# Install packages
RUN apt-get update -q && apt-get install -q -y --no-install-recommends \
    apt-transport-https curl gnupg apt-utils wget gcc g++ npm unzip build-essential ca-certificates curl git gh \
    make nano iproute2 nano openssh-client openssl procps \
    software-properties-common bzip2 subversion neofetch \
    fontconfig && \
    curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor >bazel-archive-keyring.gpg && \
    mv bazel-archive-keyring.gpg /usr/share/keyrings && \
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/bazel-archive-keyring.gpg] https://storage.googleapis.com/bazel-apt stable jdk1.8" \
    | tee /etc/apt/sources.list.d/bazel.list && \
    apt update && apt install -q -y bazel && \
    apt-get full-upgrade -q -y && \
    cd ~ && \
    wget https://github.com/ryanoasis/nerd-fonts/releases/download/v2.1.0/Meslo.zip && \
    mkdir -p .local/share/fonts && \
    unzip Meslo.zip -d .local/share/fonts && \
    cd .local/share/fonts && rm *Windows* && \
    cd ~ && \
    rm Meslo.zip && \
    fc-cache -fv && \
    apt-get install -y zsh zsh-doc chroma
    
# Install anaconda
RUN wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh -O Anaconda.sh && \
    /bin/bash Anaconda.sh -b -p /opt/conda && \
    rm Anaconda.sh

# Install LSD for ls substitute and clean up
RUN wget https://github.com/Peltoche/lsd/releases/download/0.23.1/lsd_0.23.1_amd64.deb -P /tmp && \
    dpkg -i /tmp/lsd_0.23.1_amd64.deb && \
    rm /tmp/lsd_0.23.1_amd64.deb && \
    apt-get autoremove -y && \
    apt-get autoclean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*/apt/lists/* && \
    useradd -r -m -s /bin/bash -u ${USER_UID} docklearning

# Add conda to PATH
ENV PATH=/opt/conda/bin:$PATH

ENV HOME=/home/docklearning

# Make zsh default shell
RUN chsh -s /usr/bin/zsh docklearning

# link conda to /etc/profile.d
RUN ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh

# Update Conda Base env
RUN conda update -n base --all -y && \
    conda install -c conda-forge bandit opt_einsum keras-preprocessing && \
    conda clean -a -q -y
     
# Download Tensorflow from github and build from source
RUN git clone https://github.com/tensorflow/tensorflow.git && \
    cd tensorflow && \
    ./configure && \
    bazel build --config=opt --config=cuda --cxxopt="-mavx2" //tensorflow/tools/pip_package:build_pip_package && \
    ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg && \
    pip install /tmp/tensorflow_pkg/tensorflow-*.whl && \
    cd .. && \
    rm -rf tensorflow

# Create shell history file
RUN mkdir ${HOME}/zsh_history && \
    chown docklearning ${HOME}/zsh_history && \
    mkdir ${HOME}/.ssh

# Switch to internal user
USER docklearning
WORKDIR ${HOME}

# Copy user configuration files
COPY --chown=docklearning ./config/.aliases.sh ./
COPY --chown=docklearning ./config/.bashrc ./
COPY --chown=docklearning ./config/.nanorc ./

# Configure Zsh for internal user
ENV ZSH=${HOME}/.oh-my-zsh
ENV ZSH_CUSTOM=${ZSH}/custom
ENV ZSH_PLUGINS=${ZSH_CUSTOM}/plugins
ENV ZSH_THEMES=${ZSH_CUSTOM}/themes
RUN wget -qO- https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh | zsh || true
RUN git clone --single-branch --branch 'master' --depth 1 https://github.com/zsh-users/zsh-syntax-highlighting.git ${ZSH_PLUGINS}/zsh-syntax-highlighting \
    && git clone --single-branch --branch 'master' --depth 1 https://github.com/zsh-users/zsh-autosuggestions ${ZSH_PLUGINS}/zsh-autosuggestions \
    && git clone --single-branch --depth 1 https://github.com/romkatv/powerlevel10k.git ${ZSH_THEMES}/powerlevel10k
COPY --chown=docklearning ./config/.p10k.zsh ./
COPY --chown=docklearning ./config/.zshrc ./

CMD [ "/bin/zsh" ]

e4yzc0pl  #1

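I wouldn't say this question is about Docker or TensorFlow; it's really about package installation and OS configuration.
In this case, you first need to fix it by explicitly specifying the Bazel version: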

apt install -q -y bazel-5.3.0

However, this versioned package does not create a bazel symlink, so you need to create it yourself:

ln -s /bin/bazel-5.3.0 /bin/bazel
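The symlink step can also be wrapped in a small guard so that it fails loudly when the versioned binary is missing. The following is only a sketch: the function name `link_versioned_bazel` and its arguments are hypothetical and not part of Bazel or of the original answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bazel): point <bin_dir>/bazel at
# <bin_dir>/bazel-<version>.
# $1 = directory holding the versioned binary, $2 = version such as 5.3.0
link_versioned_bazel() {
    bin_dir="$1"
    version="$2"
    target="$bin_dir/bazel-$version"

    # Stop early if the versioned binary was not installed.
    if [ ! -e "$target" ]; then
        echo "missing $target" >&2
        return 1
    fi

    # -s makes a symbolic link; -f replaces an existing bazel entry.
    ln -sf "$target" "$bin_dir/bazel"
}
```

Assuming the package places its binary in /bin as in the command above, calling `link_versioned_bazel /bin 5.3.0` would have the same effect as the manual `ln -s`.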

You can verify that it is installed by running bazel:

bazel --version
bazel 5.3.0
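The same check can be automated by comparing the output of `bazel --version` against the project's .bazelversion file. This is a minimal sketch under two assumptions: the required version sits on the first line of .bazelversion, and the output has the `bazel <version>` form shown above. The function name `check_bazel_version` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare the installed Bazel with a .bazelversion file.
# $1 = path to .bazelversion, $2 = bazel command (defaults to "bazel")
check_bazel_version() {
    # Assumption: the required version is on the first line of the file.
    required="$(head -n 1 "$1" | tr -d '[:space:]')"

    # `bazel --version` prints e.g. "bazel 5.3.0"; keep the second field.
    installed="$("${2:-bazel}" --version | awk '{print $2}')"

    if [ "$installed" = "$required" ]; then
        echo "ok: bazel $installed"
    else
        echo "mismatch: need $required, found $installed" >&2
        return 1
    fi
}
```

Run from the directory that contains the tensorflow checkout, `check_bazel_version tensorflow/.bazelversion` returns a non-zero status on a mismatch, so a Docker build step that calls it would stop before the long `bazel build` starts.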
