这个问题简直让我发疯。我在本地局域网上运行了一个单机风暴示例。我正在跑步 v0.9.1-incubating
发布版本(来自apache孵化器站点。问题很简单,我的 storm supervisor
进程拒绝在每次重新启动后启动。黑客修复非常简单,删除 supervisor
以及 workers
从storm文件夹本地目录重新运行进程;一切都很顺利,直到下次重启。
我提供了我认为可能与调试此问题相关的所有信息。如果需要的话,请要求更多,但请帮我解决一些问题。
ps:不管我是否运行拓扑。
zookeeper版本:3.4.5
风暴版:0.9.1-孵化(使用网状运输)
storm和zookeeper在同一台机器上运行。
监管者版本:3.0b2
操作系统:ubuntu 12.04 lts
处理器:amd phenom(tm)ii x6 1055t处理器× 6
闸板:5.6 gib
主管配置
[program:zookeeper]
command=/path/to/zookeeper/bin/zkServer.sh "start-foreground"
process_name=zookeeper
directory=/path/to/zookeeper/bin
stdout_logfile=/var/log/zookeeper.log ; stdout log path, NONE$
stderr_logfile=/var/log/err.zookeeper.log ; stderr log path, $
priority=2
user=root
[program:storm-nimbus]
command=/path/to/storm/bin/storm nimbus
user=root
autostart=true
autorestart=true
startsecs=10
startretries=2
log_stdout=true
log_stderr=true
stderr_logfile=/var/log/storm/nimbus.err.log
stdout_logfile=/var/log/storm/nimbus.out.log
logfile_maxbytes=20MB
logfile_backups=2
priority=10
[program:storm-ui]
command=/path/to/storm/bin/storm ui
user=root
autostart=true
autorestart=true
startsecs=10
startretries=2
log_stdout=true
log_stderr=true
stderr_logfile=/var/log/storm/ui.err.log
stdout_logfile=/var/log/storm/ui.out.log
logfile_maxbytes=20MB
logfile_backups=2
priority=500
[program:storm-supervisor]
command=/path/to/storm/bin/storm supervisor
user=root
autostart=true
autorestart=true
startsecs=10
startretries=2
log_stdout=true
log_stderr=true
stderr_logfile=/var/log/storm/supervisor.err.log
stdout_logfile=/var/log/storm/supervisor.log.log
logfile_maxbytes=20MB
logfile_backups=2
priority=600
[program:storm-logviewer]
command=/path/to/storm/bin/storm logviewer
user=root
autostart=true
autorestart=true
startsecs=10
startretries=2
log_stdout=true
log_stderr=true
stderr_logfile=/var/log/storm/log.err.log
stdout_logfile=/var/log/storm/log.out.log
logfile_maxbytes=20MB
logfile_backups=2
priority=900
风暴形态
# Zookeeper
storm.zookeeper.servers:
- "192.168.1.11"
# Nimbus
nimbus.host: "192.168.1.11"
nimbus.childopts: '-Xmx1024m -Djava.net.preferIPv4Stack=true -Dprocess=storm'
# UI
ui.port: 9090
ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true -Dprocess=storm"
# Supervisor
supervisor.childopts: '-Djava.net.preferIPv4Stack=true -Dprocess=storm'
# Worker
worker.childopts: '-Xmx768m -Djava.net.preferIPv4Stack=true -Dprocess=storm'
storm.local.dir: "/path/to/storm"
storm.messaging.transport: "backtype.storm.messaging.netty.Context"
storm.messaging.netty.server_worker_threads: 1
storm.messaging.netty.client_worker_threads: 1
storm.messaging.netty.buffer_size: 5242880
storm.messaging.netty.max_retries: 100
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100
错误消息
日志错误消息的粘贴箱。我在这里交叉发布相关信息。
java.lang.RuntimeException: java.io.EOFException
at backtype.storm.utils.Utils.deserialize(Utils.java:86) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
at backtype.storm.utils.LocalState.snapshot(LocalState.java:45) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
at backtype.storm.utils.LocalState.get(LocalState.java:56) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:207) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
at clojure.lang.AFn.applyToHelper(AFn.java:161) [clojure-1.4.0.jar:na]
at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
at clojure.core$apply.invoke(core.clj:603) ~[clojure-1.4.0.jar:na]
at clojure.core$partial$fn__4070.doInvoke(core.clj:2343) ~[clojure-1.4.0.jar:na]
at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.4.0.jar:na]
at backtype.storm.event$event_manager$fn__2593.invoke(event.clj:39) ~[na:na]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
at java.lang.Thread.run(Thread.java:679) [na:1.6.0_27]
Caused by: java.io.EOFException: null
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2322) ~[na:1.6.0_27]
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2791) ~[na:1.6.0_27]
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:798) ~[na:1.6.0_27]
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:298) ~[na:1.6.0_27]
at backtype.storm.utils.Utils.deserialize(Utils.java:81) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
... 11 common frames omitted
2014-03-11 12:27:25 b.s.util [INFO] Halting process: ("Error when processing an event")
2条答案
按热度按时间7ivaypg91#
当我们的两台开发服务器断电时,我们遇到了完全相同的问题(启动时主管崩溃和相同的日志错误消息)。我想只是停止服务器而不停止管理器也会有同样的效果。
我们找到的唯一可行的解决方案是删除“storm local/supervisor”文件夹(我猜里面有些东西被破坏了)。
wj8zmpe12#
我也面临类似的问题。我总是删除本地文件夹并重新启动拓扑。