Mahout: unable to convert to sequence files

xbp102n0  posted on 2021-06-04  in Hadoop

I want to convert some text files into Mahout sequence files, so I ran:

    mahout seqdirectory -i inputFolder -o outputFolder

But I always get this exception:

    java.lang.Exception: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
    Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
        at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
        ... 11 more
    Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
        at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:52)
        ... 16 more

I am running Mahout 0.8 on Hadoop 2.2.0.
Any ideas?

lyfkaqu1  #1

From what I have read, Mahout 0.8 works with Hadoop 1.2. I downloaded only Mahout (it uses the Hadoop JARs in lib/hadoop).

k2fxgqgv  #2

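The previous answer is not correct. Mahout 0.8 has a MapReduce version of seqdirectory, which is a new feature. A bug in the MR version causes the exception you are seeing.
To run seqdirectory with Mahout 0.8, use the sequential version by specifying the -xm sequential option on the command line.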

    mahout seqdirectory -i inputFolder -o outputFolder -xm sequential

By default, if the option is not specified, seqdirectory runs the MR version.
This issue has been fixed in Mahout 0.9.
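If the same script has to run against both Mahout 0.8 and later releases, the workaround can be applied conditionally. The following is only a minimal sketch: `build_seqdirectory_cmd` is a hypothetical helper (not part of Mahout), the version string is assumed to be supplied by the caller, and the paths are assumed to contain no spaces. It prints the command line rather than running it.

```shell
#!/bin/sh
# Hypothetical helper, not part of Mahout: prints the seqdirectory command
# line to use for a given Mahout version.
# usage: build_seqdirectory_cmd <mahout_version> <input_dir> <output_dir>
build_seqdirectory_cmd() {
    version=$1
    input=$2
    output=$3

    cmd="mahout seqdirectory -i $input -o $output"

    # Only the 0.8 release line gets the sequential workaround; per the
    # answer above, the MR version is fixed in 0.9.
    case "$version" in
        0.8*) cmd="$cmd -xm sequential" ;;
    esac

    printf '%s\n' "$cmd"
}
```

For example, `build_seqdirectory_cmd 0.8 inputFolder outputFolder` prints `mahout seqdirectory -i inputFolder -o outputFolder -xm sequential`, while passing `0.9` prints the command without the `-xm` option.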
