Flink job fails to create a RocksDB instance

Posted by dsf9zpds on 2021-06-21 in Flink

I run many jobs on Flink with RocksDB as the state backend. One of my jobs hit an error and kept restarting all night.
The error message is as follows:

java.lang.IllegalStateException: Could not initialize keyed state backend.
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:330)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:221)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:679)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:666)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:708)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Error while opening RocksDB instance.
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.openDB(RocksDBKeyedStateBackend.java:1063)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.access$3300(RocksDBKeyedStateBackend.java:128)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBIncrementalRestoreOperation.restoreInstance(RocksDBKeyedStateBackend.java:1472)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBIncrementalRestoreOperation.restore(RocksDBKeyedStateBackend.java:1569)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.restore(RocksDBKeyedStateBackend.java:996)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(StreamTask.java:775)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:319)
Caused by: org.rocksdb.RocksDBException: Corruption: Sst file size mismatch: /mnt/dfs/3/hadoop/yarn/local/usercache/sloth/appcache/application_1526888270443_0002/flink-io-84ec9962-f37f-4fbc-8262-a215984d8d70/job-1a72a5f09ac8a80914256306363505aa_op-CoStreamFlatMap_1361_4_uuid-0b019d7f-2d28-44dc-baf2-12774ed3518f/db/008919.sst. Size recorded in manifest 132174005, actual size 2674688
Sst file size mismatch: /mnt/dfs/3/hadoop/yarn/local/usercache/sloth/appcache/application_1526888270443_0002/flink-io-84ec9962-f37f-4fbc-8262-a215984d8d70/job-1a72a5f09ac8a80914256306363505aa_op-CoStreamFlatMap_1361_4_uuid-0b019d7f-2d28-44dc-baf2-12774ed3518f/db/008626.sst. Size recorded in manifest 111956797, actual size 14286848
Sst file size mismatch: /mnt/dfs/3/hadoop/yarn/local/usercache/sloth/appcache/application_1526888270443_0002/flink-io-84ec9962-f37f-4fbc-8262-a215984d8d70/job-1a72a5f09ac8a80914256306363505aa_op-CoStreamFlatMap_1361_4_uuid-0b019d7f-2d28-44dc-baf2-12774ed3518f/db/008959.sst. Size recorded in manifest 43157714, actual size 933888
    at org.rocksdb.TtlDB.openCF(Native Method)
    at org.rocksdb.TtlDB.open(TtlDB.java:132)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.openDB(RocksDBKeyedStateBackend.java:1054)
    ... 12 more

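When I noticed this, I killed the job manually and restarted it, and after that it worked fine.
How does this error happen? I could not find any information about it on Google or anywhere else.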
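
To make the "Sst file size mismatch" lines in the trace easier to read, here is a minimal Java sketch that extracts the two sizes from such a line and reports how many bytes are missing. The class name SstMismatch, the method missingBytes, and the shortened path /tmp/db/008919.sst are hypothetical and used only for illustration; the regular expression is an assumption based on the message format shown in the trace above, not on any RocksDB API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SstMismatch {
    // Pattern derived from the message text in the trace (an assumption, not a RocksDB API).
    private static final Pattern MISMATCH = Pattern.compile(
        "Sst file size mismatch: (\\S+)\\. Size recorded in manifest (\\d+), actual size (\\d+)");

    /** Returns recorded size minus actual size in bytes, or -1 if the line does not match. */
    public static long missingBytes(String line) {
        Matcher m = MISMATCH.matcher(line);
        if (!m.find()) {
            return -1L;
        }
        long recorded = Long.parseLong(m.group(2));
        long actual = Long.parseLong(m.group(3));
        return recorded - actual;
    }

    public static void main(String[] args) {
        // Sizes copied from the first mismatch line in the trace; the path is shortened.
        String line = "Sst file size mismatch: /tmp/db/008919.sst. "
            + "Size recorded in manifest 132174005, actual size 2674688";
        System.out.println(missingBytes(line)); // prints 129499317
    }
}
```

In all three mismatch lines the file on disk is much smaller than the size recorded in the manifest, which suggests the files were cut short while being written rather than modified afterwards.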

Answer #1 (slsn1g29)

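I found another error in the logs just before this exception:

    No space left on device

I think that error is the cause of this problem. The disk most likely filled up while the SST files were being written during the restore, which would explain why the actual sizes are much smaller than the sizes recorded in the manifest.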
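
If running out of disk space is the cause, it may help to check the free space on the volume that holds the RocksDB working files before restarting the job. The following is a rough Java sketch using only the standard java.nio.file API; the class name LocalDiskCheck, the method hasUsableSpace, and the 1 GiB threshold are hypothetical choices for illustration, and the directory to pass in is whichever local directory your TaskManager uses (for example the parent of the flink-io-... directory in the trace).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalDiskCheck {
    /** Returns true if the file store that holds dir has at least requiredBytes usable. */
    public static boolean hasUsableSpace(Path dir, long requiredBytes) {
        try {
            return Files.getFileStore(dir).getUsableSpace() >= requiredBytes;
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot query file store for " + dir, e);
        }
    }

    public static void main(String[] args) {
        // Directory to check: first argument, or the JVM temp dir as a fallback (assumption).
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        long required = 1L << 30; // hypothetical threshold: 1 GiB
        System.out.println(dir + " has at least 1 GiB usable: " + hasUsableSpace(dir, required));
    }
}
```

The threshold should be at least as large as the state that has to be restored; the recorded sizes in the manifest give a lower bound for the files named in the error.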
