DataStax Enterprise: cannot start because the /hadoop/conf directory is not writable

k4emjkb1  posted on 2021-05-29 in Hadoop

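I followed DataStax's guide on best practices for running DSE in Docker, but I hit the following error while using all of the default setup scripts and Dockerfiles that DataStax provides.

Error log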

Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
        at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310) ~[dse-hadoop-5.0.3.jar:5.0.3]
        at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174) ~[dse-hadoop-5.0.3.jar:5.0.3]
        at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20) ~[dse-hadoop-5.0.3.jar:5.0.3]
        at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377) ~[dse-core-5.0.3.jar:5.0.3]
        at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306) ~[dse-core-5.0.3.jar:5.0.3]
        ... 7 common frames omitted
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf
        at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveConfiguration(CassandraJobConf.java:466) ~[dse-hadoop-5.0.3.jar:5.0.3]
        at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveDseHadoopConfiguration(CassandraJobConf.java:345) ~[dse-hadoop-5.0.3.jar:5.0.3]
        at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:300) ~[dse-hadoop-5.0.3.jar:5.0.3]
        ... 11 common frames omitted
Unable to start DSE server: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
com.datastax.bdp.plugin.PluginManager$PluginActivationException: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
        at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:327)
        at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:259)
        at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:169)
        at com.datastax.bdp.plugin.PluginManager.preStart(PluginManager.java:77)
        at com.datastax.bdp.server.DseDaemon.preStart(DseDaemon.java:490)
        at com.datastax.bdp.server.DseDaemon.start(DseDaemon.java:462)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:583)
        at com.datastax.bdp.DseModule.main(DseModule.java:91)
Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
        at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310)
        at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174)
        at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20)
        at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377)
        at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306)
        ... 7 more
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf

The error is straightforward, but trying to fix it by adding a few extra chmod calls to the Dockerfile did not help.

Dockerfile


# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.

# Loosely based on docker-cassandra by the fine folk at Spotify
# -- https://github.com/spotify/docker-cassandra/

# Loosely based on cassandra-docker by the one and only Al Tobey
# -- https://github.com/tobert/cassandra-docker/

# base yourself on any ubuntu 14.04 image containing JDK8
# official Docker Java images are distributed with OpenJDK
# Datastax certifies its product releases specifically
# on the Oracle/Sun JVM, so YMMV with OpenJDK

FROM nimmis/java:oracle-8-jdk

# Avoid ERROR: invoke-rc.d: policy-rc.d denied execution of start.

RUN echo "#!/bin/sh\nexit 0" > /usr/sbin/policy-rc.d

RUN export DEBIAN_FRONTEND=noninteractive && \
    apt-get update && \
    apt-get -y install adduser \
    curl \
    lsb-base \
    procps \
    zlib1g \
    gzip \
    python \
    python-support \
    sysstat \
    ntp bash tree && \
    rm -rf /var/lib/apt/lists/*

# grab gosu for easy step-down from root

RUN curl -o /bin/gosu -SkL "https://github.com/tianon/gosu/releases/download/1.4/gosu-$(dpkg --print-architecture)" \
    && chmod +x /bin/gosu

# DSE tarball can be download into the folder where Dockerfile is
# wget --user=$USER --password=$PASS http://downloads.datastax.com/enterprise/dse-5.0.0-bin.tar.gz
# you may want to replace dse-5.0.0-bin.tar.gz with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named dse-bin.tar.gz (that way the docker file itself remains version independent).
#
# DataStax Agent debian package can be downloaded from
# wget --user=$USER --password=$PASS http://debian.datastax.com/enterprise/pool/datastax-agent_6.0.0_all.deb
# you may want to replace the specific version with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named datastax-agent_all.deb (that way the docker file itself remains version
# independent).

ADD dse.tar.gz /opt
ADD datastax-agent_all.deb /tmp

ENV DSE_HOME /opt/dse

RUN ln -s /opt/dse* $DSE_HOME

# keep data here

VOLUME /data

# and logs here

VOLUME /logs

VOLUME /opt/dse

# create a dedicated user for running DSE node

RUN groupadd -g 1337 cassandra && \
    useradd -u 1337 -g cassandra -s /bin/bash -d $DSE_HOME cassandra && \
    chown -R cassandra:cassandra /opt/dse* 

RUN chmod r+w -R /opt/dse/

# install the agent

RUN dpkg -i /tmp/datastax-agent_all.deb

# starting node using custom entrypoint that configures paths, interfaces, etc.

COPY scripts/dse-entrypoint /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-entrypoint
ENTRYPOINT ["/usr/local/bin/dse-entrypoint"]

# Running any other DSE/C* command should be done on behalf dse user

# Perform that using a generic command laucher

COPY scripts/dse-cmd-launcher /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-cmd-launcher

# link dse commands to the launcher

RUN for cmd in cqlsh dsetool nodetool dse cassandra-stress; do \
        ln -sf /usr/local/bin/dse-cmd-launcher /usr/local/bin/$cmd ; \
    done

# the detailed list of ports

# http://docs.datastax.com/en/datastax_enterprise/5.0/datastax_enterprise/sec/secConfFirePort.html

# Cassandra

EXPOSE 7000 9042 9160

# Solr

EXPOSE 8983 8984

# Spark

EXPOSE 4040 7080 7081 7077

# Hadoop

EXPOSE 8012 50030 50060 9290

# Hive/Shark

EXPOSE 10000

# Graph

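The last piece that may matter for this problem is the startup script that actually launches DSE when the container starts.

DSE startup script (invoked by the Docker container on startup)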


#!/bin/sh

# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.

# Bind the various services
# These should be updated on every container start

if [ -z ${IP} ]; then
  IP=`hostname --ip-address`
fi

echo $IP > /data/ip.address

# create directories for holding the node's data, logs, etc.

create_dirs() {
  local base_dir=$1;

  mkdir -p $base_dir/data/commitlog
  mkdir -p $base_dir/data/saved_caches
  mkdir -p $base_dir/data/hints
  mkdir -p $base_dir/logs
}

# tweak the cassandra config

tweak_cassandra_config() {
  env="$1/cassandra-env.sh"
  conf="$1/cassandra.yaml"

  base_data_dir="/data"

  # Set the cluster name
  if [ -z "${CLUSTER_NAME}" ]; then
    printf " - No cluster name provided; skipping.\n"
  else
    printf " - Setting up the cluster name: ${CLUSTER_NAME}\n"
    regexp="s/Test Cluster/${CLUSTER_NAME}/g"
    sed -i -- "$regexp" $conf
  fi

  # Set the commitlog directory, and various other directories
  # These are done only once since the regexep matches will fail on subsequent
  # runs.
  printf " - Setting up directories\n"
  regexp="s|/var/lib/cassandra/|$base_data_dir/|g"
  sed -i -- "$regexp" $conf
  regexp="s/^listen_address:.*/listen_address: ${IP}/g"
  sed -i -- "$regexp" $conf
  regexp="s/rpc_address:.*/rpc_address: ${IP}/g"
  sed -i -- "$regexp" $conf

  # seeds
  if [ -z "${SEEDS}" ]; then
    printf " - Using own IP address ${IP} as seed.\n";
    regexp="s/seeds:.*/seeds: \"${IP}\"/g";
  else
    printf " - Using seeds: $SEEDS\n";
    regexp="s/seeds:.*/seeds: \"${IP},${SEEDS}\"/g"
  fi
  sed -i -- "$regexp" $conf

  # JMX
  echo "JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=127.0.0.1\"" >> $env
}

tweak_dse_in_sh() {
  # point C* logs dir to the created volume
  sed -i -- "s|/var/log/cassandra|/logs|g" "$1/dse.in.sh"
}

tweak_spark_config() {
  sed -i -- "s|/var/lib/spark/|/data/spark/|g" "$1/spark-env.sh"
  sed -i -- "s|/var/log/spark/|/logs/spark/|g" "$1/spark-env.sh"
  mkdir -p /data/spark/worker
  mkdir -p /data/spark/rdd
  mkdir -p /logs/spark/worker
}

tweak_agent_config() {
  [ -d "/var/lib/datastax-agent" ] && cat > /var/lib/datastax-agent/conf/address.yaml <<EOF
stomp_interface: ${STOMP_INTERFACE}
use_ssl: 0
local_interface: ${IP}
hosts: ["${IP}"]
cassandra_install_location: /opt/dse
cassandra_log_location: /logs
EOF
  chown cassandra:cassandra /var/lib/datastax-agent/conf/address.yaml
}

setup_node() {
  printf "* Setting up node...\n"
  printf " + Setting up node...\n"

  create_dirs
  tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
  tweak_dse_in_sh "$DSE_HOME/bin"
  tweak_spark_config "$DSE_HOME/resources/spark/conf"
  tweak_agent_config
  chown -R cassandra:cassandra /data /logs /conf

  # mark that we tweaked configs
  touch "$DSE_HOME/tweaked_configs"

  printf "Done.\n"
}

# if marker file doesn't exist, setup node

[ ! -f "$DSE_HOME/tweaked_configs" ] && setup_node

[ -f "/etc/init.d/datastax-agent" ] && /etc/init.d/datastax-agent start

exec gosu cassandra "$DSE_HOME/bin/dse" cassandra -f "$@"

Docker container command-line arguments

Below are the command-line arguments I use to start a single DSE instance through Docker:


#!/bin/bash

# Used to start a single DSE node that has both Spark and Cassandra running on it

OPSC_CONTAINER=$1

if [ -z "$OPSC_CONTAINER" ]; then
  echo "usage: start_docker_cluster.sh OPSCContainerName"
  echo "  OPSCContainerName   mandatory name of the container running OpsCenter"
  exit 1
fi

[ -z "$CLUSTER_NAME" ] && CLUSTER_NAME="Test_Cluster"

STOMP_INTERFACE=`docker exec $OPSC_CONTAINER hostname -I`
docker run -p 7080:7080 -p 4040:4040 -p 7077:7077 -p 9042:9042 --link $OPSC_CONTAINER -d -e CLUSTER_NAME="$CLUSTER_NAME" -e STOMP_INTERFACE="$STOMP_INTERFACE" --name dse dse -k -t

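The -k -t flags mean that both Hadoop and Spark are started for this container. I also tried dropping the -t flag, and the configuration error still occurs without it.
What do I need to do to make the /opt/dse/resources/hadoop/conf directory writable so that DSE can bootstrap successfully?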
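
Before changing ownership, it can help to confirm which directory the DSE process cannot write to. The following is a minimal sketch, not part of the DataStax scripts: check_writable is a hypothetical helper name, and the default path is simply the one from the stack trace above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DataStax scripts): report whether
# a directory exists and is writable by the user running this script.
check_writable() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 2
  fi
  if [ -w "$dir" ]; then
    echo "writable: $dir"
    return 0
  fi
  # Print owner and group so a mismatch with the runtime user is visible.
  echo "not writable: $dir (owner: $(ls -ld "$dir" | awk '{print $3 ":" $4}'))"
  return 1
}

# Check the directory named in the stack trace; DSE_HOME falls back to
# /opt/dse as in the Dockerfile.
check_writable "${DSE_HOME:-/opt/dse}/resources/hadoop/conf" || true
```

The result depends on who runs it. Root can write almost anywhere, so to see what DSE sees, run the script inside the container as the cassandra user, for example through gosu as the entrypoint does.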

e5nqia27 1#

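Do this: "I added chown -RHh cassandra:cassandra /opt/dse to the setup_node() part of the DSE startup script (invoked by the Docker container on startup)."
Max's answer worked for me, but instead of the follow-up problem he ran into, I got: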

Unable to activate plugin com.datastax.bdp.plugin.DseFsPlugin
(...)
java.io.IOException: Failed to create work directory: /var/lib/dsefs

So I had to change my setup_node() to this:

setup_node() {
  printf "* Setting up node...\n"
  printf " + Setting up node...\n"

  create_dirs
  tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
  tweak_dse_in_sh "$DSE_HOME/bin"
  tweak_spark_config "$DSE_HOME/resources/spark/conf"
  tweak_agent_config
  chown -R cassandra:cassandra /data /logs /conf

  mkdir /var/lib/dsefs
  chown -RHh cassandra:cassandra /opt/dse /var/lib/dsefs

  # mark that we tweaked configs
  touch "$DSE_HOME/tweaked_configs"

  printf "Done.\n"
}
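
The mkdir and chown pair above could also be wrapped in a small helper so that it is safe to run more than once. This is only a sketch: ensure_owned_dir is a hypothetical name, and mkdir -p is my substitution for the plain mkdir so that an existing directory is not treated as an error.

```shell
#!/bin/sh
# Hypothetical helper: create a directory if needed and hand it to the
# given owner. In the entrypoint the owner would be cassandra:cassandra.
ensure_owned_dir() {
  owner=$1
  dir=$2
  mkdir -p "$dir" || return 1     # -p: no error if it already exists
  chown -RHh "$owner" "$dir"      # same chown flags as in setup_node()
}
```

Inside setup_node(), which runs as root, the call would look like ensure_owned_dir cassandra:cassandra /var/lib/dsefs.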
muk1a3rh 2#

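Adding 'chown -RHh cassandra:cassandra /opt/dse' to the entrypoint script fixed the problem of not being able to write to /opt/dse/resources/hadoop/conf.
Re: ERROR 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker
Check spark-env.sh and look at the directory mappings. In my case I mounted two external volumes, /data and /logs. Both directories are owned by cassandra:cassandra.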


# This is a base directory for Spark Worker work files.

if [ "x$SPARK_WORKER_DIR" = "x" ]; then
    export SPARK_WORKER_DIR="/data/spark/worker"
fi

if [ "x$SPARK_LOCAL_DIRS" = "x" ]; then
    export SPARK_LOCAL_DIRS="/data/spark/rdd"
fi

# This is a base directory for Spark Worker logs.

if [ "x$SPARK_WORKER_LOG_DIR" = "x" ]; then
   export SPARK_WORKER_LOG_DIR="/logs/spark/worker"
fi

# This is a base directory for Spark Master logs.

if [ "x$SPARK_MASTER_LOG_DIR" = "x" ]; then
   export SPARK_MASTER_LOG_DIR="/logs/spark/master"
fi
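
If spark-env.sh points at directories like these, they also have to exist and be writable before the worker starts. Note that tweak_spark_config in the question's entrypoint creates /data/spark/worker, /data/spark/rdd and /logs/spark/worker, but not /logs/spark/master. A minimal sketch of creating all four follows; make_spark_dirs is a hypothetical helper, and the defaults are taken from the excerpt above.

```shell
#!/bin/sh
# Defaults mirror the spark-env.sh excerpt; values already set in the
# environment take precedence.
: "${SPARK_WORKER_DIR:=/data/spark/worker}"
: "${SPARK_LOCAL_DIRS:=/data/spark/rdd}"
: "${SPARK_WORKER_LOG_DIR:=/logs/spark/worker}"
: "${SPARK_MASTER_LOG_DIR:=/logs/spark/master}"

# Hypothetical helper: create every directory passed in and return
# non-zero if any of them could not be created.
make_spark_dirs() {
  rc=0
  for d in "$@"; do
    mkdir -p "$d" || { echo "cannot create $d" >&2; rc=1; }
  done
  return $rc
}

# In the entrypoint (as root, before the chown of /data and /logs):
# make_spark_dirs "$SPARK_WORKER_DIR" "$SPARK_LOCAL_DIRS" \
#                 "$SPARK_WORKER_LOG_DIR" "$SPARK_MASTER_LOG_DIR"
```

The call is left commented out so that running the snippet outside a container does not create /data or /logs on the host.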

This video shows a fully functional DSE Enterprise setup running on Docker: https://vimeo.com/181393134

euoag5mw 3#

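I added chown -RHh cassandra:cassandra /opt/dse to the setup_node() part of the DSE startup script (invoked by the Docker container on startup), and that fixed the problem. Check chown --help for more information about these options.
Note: I now get an ERROR 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker later on, but at least my fix will get you past the initial problem.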

setup_node() {
  printf "* Setting up node...\n"
  printf " + Setting up node...\n"

  create_dirs
  tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
  tweak_dse_in_sh "$DSE_HOME/bin"
  tweak_spark_config "$DSE_HOME/resources/spark/conf"
  tweak_agent_config
  tweak_dse_config "$DSE_HOME/resources/dse/conf"
  chown -R cassandra:cassandra /data /logs /conf

  chown -RHh cassandra:cassandra /opt/dse

  # mark that we tweaked configs
  touch "$DSE_HOME/tweaked_configs"

  printf "Done.\n"
}
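
To see what the -RHh flags do without touching a real install, the layout can be imitated in a scratch directory. This is a sketch under assumptions: the directory names only mimic the Dockerfile's symlink from /opt/dse to the unpacked tarball, example.xml is a placeholder file, and the current user stands in for cassandra:cassandra so the demo runs without root.

```shell
#!/bin/sh
# Scratch-directory demo; nothing here touches /opt/dse.
demo=$(mktemp -d)
mkdir -p "$demo/dse-5.0.3/resources/hadoop/conf"
touch "$demo/dse-5.0.3/resources/hadoop/conf/example.xml"   # placeholder file
ln -s "$demo/dse-5.0.3" "$demo/dse"    # imitates: ln -s /opt/dse* $DSE_HOME

# The answers use cassandra:cassandra; the current user is used here so
# the demo also works for a non-root user.
owner="$(id -u):$(id -g)"

# -R  operate on files and directories recursively
# -H  if a command-line argument is a symlink to a directory, traverse it
# -h  affect symbolic links themselves instead of the referenced file
chown -RHh "$owner" "$demo/dse" && echo "chown ok"

# Show the ownership of the conf directory reached through the symlink.
ls -ld "$demo/dse/resources/hadoop/conf"
```

As for why the chown in the Dockerfile has no effect, one possible explanation (an assumption, not confirmed in this thread) is that it runs after VOLUME /opt/dse. Docker's documentation notes that build steps which change data inside a volume after it has been declared are discarded, and running chown at container start avoids that.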
