MapReduce on Hadoop: String exception at runtime

vd2z7a6w · posted 2021-05-29 in Hadoop

I am trying out a MapReduce program in Java on Hadoop 2.6. I tried referring to other Stack Overflow posts, but failed to debug the code.
First, let me describe the records:

subid=0000111911128052627towerid=11232w3453254345634623456984756894756bytes=12211221221221221221218.462170216543667e17
subid=0000111911128052639towerid=11232w3453254345634623456984756894756bytes=12211221221221219.67263121586e17
subid=0000111911128052615towerid=11232W34532543456345623456984756894756bytes=12211221221221221221221216.9431647633139046e17
subid=0000111911128052615towerid=11232W34532543456345623456984756894756bytes=12211221221221221221214.7836041834471E17
Now the mapper class, AircelMapper.class:

import java.io.IOException;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.io.*;

public class AircelMapper extends Mapper<LongWritable, Text, Text, LongWritable>
{
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
    {
        String acquire = value.toString();
        // the bytes value is read from fixed character offsets 81..83 of the line
        String st = acquire.substring(81, 84);

        LongWritable bytes = new LongWritable(Long.parseLong(st));
        // the subid value is read from fixed character offsets 6..25 of the line
        context.write(new Text(acquire.substring(6, 26)), bytes);
    }
}

Now the driver class, AircelDriver.class:

import java.io.IOException;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class AircelDriver
{
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException
    {
        if (args.length < 2)
        {
            System.out.println("Specify the input and output paths correctly");
            System.exit(-1);
        }

        Job job = Job.getInstance();

        job.setJobName(" @@@@@@@@@@@@@@@       MY FIRST PROGRAM        @@@@@@@@@@@@@@@");

        job.setJarByClass(AircelDriver.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setMapperClass(AircelMapper.class);
        job.setReducerClass(AircelReducer.class);
        // Note: submit() is redundant here; waitForCompletion(true) submits
        // the job itself if it has not been submitted yet.
        job.submit();
        job.waitForCompletion(true);

    }
}

I have not posted the reducer class, because the problem arises at runtime in the mapper code. The Hadoop runtime output is below (it essentially indicates job failure):

16/12/18 04:11:00 INFO mapred.LocalJobRunner: Starting task: attempt_local1618565735_0001_m_000000_0
16/12/18 04:11:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/12/18 04:11:01 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/12/18 04:11:01 INFO mapred.MapTask: Processing split: hdfs://quickstart.cloudera:8020/practice/Data_File.txt:0+1198702
16/12/18 04:11:01 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/12/18 04:11:01 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/12/18 04:11:01 INFO mapred.MapTask: soft limit at 83886080
16/12/18 04:11:01 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/12/18 04:11:01 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/12/18 04:11:01 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/12/18 04:11:01 INFO mapreduce.Job: Job job_local1618565735_0001 running in uber mode : false
16/12/18 04:11:01 INFO mapreduce.Job:  map 0% reduce 0%
16/12/18 04:11:02 INFO mapred.MapTask: Starting flush of map output
16/12/18 04:11:02 INFO mapred.MapTask: Spilling map output
16/12/18 04:11:02 INFO mapred.MapTask: bufstart = 0; bufend = 290000; bufvoid = 104857600
16/12/18 04:11:02 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26174400(104697600); length = 39997/6553600
16/12/18 04:11:03 INFO mapred.MapTask: Finished spill 0
16/12/18 04:11:03 INFO mapred.LocalJobRunner: map task executor complete.
16/12/18 04:11:03 WARN mapred.LocalJobRunner: job_local1618565735_0001

java.lang.Exception: java.lang.StringIndexOutOfBoundsException: String index out of range: 84

    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 84
    at java.lang.String.substring(String.java:1907)
    at AircelMapper.map(AircelMapper.java:13)
    at AircelMapper.map(AircelMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(Fut

Why does it throw a StringIndexOutOfBoundsException? Does the String class have an internal limit on a string's size? I don't understand what the problem is on lines 13-15 of the mapper class (the substring calls).

Answer #1, by 7hiiyaii:

From the substring Javadoc: IndexOutOfBoundsException - thrown if beginIndex is negative, or endIndex is larger than the length of this String object, or beginIndex is larger than endIndex.
public StringIndexOutOfBoundsException(int index) constructs a new StringIndexOutOfBoundsException with an argument indicating the illegal index - 84 in your case.
public StringIndexOutOfBoundsException(String s) constructs a StringIndexOutOfBoundsException with the specified detail message - "String index out of range" in your case.
Check your input around index 84: at least one line in the input file is shorter than 84 characters, so acquire.substring(81, 84) asks for an endIndex past the end of that line. The String class has no relevant size limit; the offsets simply don't exist in that record.
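
As a minimal sketch of a fix (my illustration, not the original poster's code, and assuming records too short for the fixed offsets should simply be skipped), the map() method can guard the substring calls with a length check:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class AircelMapper extends Mapper<LongWritable, Text, Text, LongWritable>
{
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
    {
        String acquire = value.toString();

        // The fixed offsets 6..26 and 81..84 only exist on sufficiently long
        // lines; skip shorter ones instead of letting substring() throw.
        if (acquire.length() < 84) {
            return;
        }

        String st = acquire.substring(81, 84);
        LongWritable bytes = new LongWritable(Long.parseLong(st));
        context.write(new Text(acquire.substring(6, 26)), bytes);
    }
}

Rather than dropping bad records silently, a real job would usually also increment a counter, e.g. context.getCounter("AircelMapper", "skippedRecords").increment(1), so skipped lines show up in the job's counter output.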
