Error: Java heap space in the reducer phase

pvcm50d1 · posted 2021-06-02 · in Hadoop

I'm hitting a Java heap space error in the reducer phase. The job uses 41 reducers and a custom partitioner class. Below is my code, which fails with the errors shown in the log that follows.

17/02/12 05:26:45 INFO mapreduce.Job: map 98% reduce 0%
17/02/12 05:28:02 INFO mapreduce.Job: map 100% reduce 0%
17/02/12 05:28:09 INFO mapreduce.Job: map 100% reduce 17%
17/02/12 05:28:10 INFO mapreduce.Job: map 100% reduce 39%
17/02/12 05:28:11 INFO mapreduce.Job: map 100% reduce 46%
17/02/12 05:28:12 INFO mapreduce.Job: map 100% reduce 51%
17/02/12 05:28:13 INFO mapreduce.Job: map 100% reduce 54%
17/02/12 05:28:14 INFO mapreduce.Job: map 100% reduce 56%
17/02/12 05:28:15 INFO mapreduce.Job: map 100% reduce 88%
17/02/12 05:28:16 INFO mapreduce.Job: map 100% reduce 90%
17/02/12 05:28:18 INFO mapreduce.Job: map 100% reduce 93%
17/02/12 05:28:18 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000020_0, Status : FAILED
Error: Java heap space
17/02/12 05:28:19 INFO mapreduce.Job: map 100% reduce 91%
17/02/12 05:28:20 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000021_0, Status : FAILED
Error: Java heap space
17/02/12 05:28:22 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000027_0, Status : FAILED
Error: Java heap space
17/02/12 05:28:23 INFO mapreduce.Job: map 100% reduce 89%
17/02/12 05:28:24 INFO mapreduce.Job: map 100% reduce 90%
17/02/12 05:28:24 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000029_0, Status : FAILED
Error: Java heap space

Here is my code:

public class MyReducer extends Reducer<NullWritable, Text, NullWritable, Text> {

    private Logger logger = Logger.getLogger(MyReducer.class);
    StringBuilder sb = new StringBuilder();
    private MultipleOutputs<NullWritable, Text> multipleOutputs;

    public void setup(Context context) {
        logger.info("Inside Reducer.");
        multipleOutputs = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    public void reduce(NullWritable Key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            final String valueStr = value.toString();
            if (valueStr.contains("Japan")) {
                sb.append(valueStr.substring(0, valueStr.length() - 20));
            } else if (valueStr.contains("SelfSourcedPrivate")) {
                sb.append(valueStr.substring(0, valueStr.length() - 29));
            } else if (valueStr.contains("SelfSourcedPublic")) {
                sb.append(value.toString().substring(0, valueStr.length() - 29));
            } else if (valueStr.contains("ThirdPartyPrivate")) {
                sb.append(valueStr.substring(0, valueStr.length() - 25));
            }
        }
        multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), "MyFileName");
    }

    public void cleanup(Context context) throws IOException, InterruptedException {
        multipleOutputs.close();
    }
}
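
For context, the custom partitioner is not shown above. Because the map output key is NullWritable, the default HashPartitioner would send every record to one reducer (all keys are identical), so partitioning has to be driven by the value itself. A simplified, illustrative version of such a partitioner (not my exact class) would look like this:

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Illustrative only: with NullWritable keys, the value decides the partition.
public class MyPartitioner extends Partitioner<NullWritable, Text> {
    @Override
    public int getPartition(NullWritable key, Text value, int numPartitions) {
        // Spread records across the configured reducers (41 in this job)
        // by hashing the value text.
        return (value.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

In the driver it is registered with job.setPartitionerClass(...) together with job.setNumReduceTasks(41).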

Can you suggest some changes that would solve my problem? Would using a combiner class improve things?


mwg9r5ms1#

I finally managed to solve it.
I simply moved multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), strName); inside the for loop, and that solved my problem. I tested it with a very large dataset (a 19 GB file) and it worked well for me. This is my final solution. Initially I thought it might create many objects, but it works fine. The MapReduce job also completes quickly.

@Override
public void reduce(NullWritable Key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    for (Text value : values) {
        final String valueStr = value.toString();
        StringBuilder sb = new StringBuilder();
        if (valueStr.contains("Japan")) {
            sb.append(valueStr.substring(0, valueStr.length() - 20));
        } else if (valueStr.contains("SelfSourcedPrivate")) {
            sb.append(valueStr.substring(0, valueStr.length() - 24));
        } else if (valueStr.contains("SelfSourcedPublic")) {
            sb.append(value.toString().substring(0, valueStr.length() - 25));
        } else if (valueStr.contains("ThirdPartyPrivate")) {
            sb.append(valueStr.substring(0, valueStr.length() - 25));
        }
        multipleOutputs.write(NullWritable.get(), new Text(sb.toString()),
                strName);
    }
}
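
The reason this fixes the heap problem: in the original code the class-level StringBuilder was never cleared, so it kept growing across every value the reducer received; in the version above a small StringBuilder is built per record and written out immediately, so memory stays bounded. I did not need to touch any memory settings, but if a reducer still ran out of heap, the reduce-side container and JVM heap can also be raised in the job configuration. The property names below are the standard Hadoop 2.x ones; the class name and values are only examples:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Example only: raising the reducer container and heap sizes.
public class HeapTunedDriver {
    public static Job buildJob() throws Exception {
        Configuration conf = new Configuration();
        // Memory requested for each reduce task container, in MB.
        conf.set("mapreduce.reduce.memory.mb", "4096");
        // JVM heap inside that container; keep it below the container size.
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");
        return Job.getInstance(conf, "heap-tuned job");
    }
}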
