I am trying out a MapReduce program on Hadoop 2.6 using Java. I have looked at other Stack Overflow posts, but I still could not debug my code.
First, let me describe what the records look like:
subid=0000111911128052627towerid=11232w3453254345634623456984756894756bytes=12211221221221221221218.462170216543667e17
subid=0000111911128052639towerid=11232w3453254345634623456984756894756bytes=12211221221221219.67263121586e17
subid=0000111911128052615towerid=11232W34532543456345623456984756894756bytes=12211221221221221221221216.9431647633139046e17
subid=0000111911128052615towerid=11232W34532543456345623456984756894756bytes=12211221221221221221214.7836041834471E17
Now the mapper class: AircelMapper.class
import java.io.IOException;
import java.lang.String;
import java.lang.Long;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.io.*;
public class AircelMapper extends Mapper<LongWritable,Text,Text, LongWritable>
{
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
    {
        String acquire=value.toString();
        String st=acquire.substring(81, 84);
        LongWritable bytes=new LongWritable(Long.parseLong(st));
        context.write(new Text(acquire.substring(6, 26)), bytes);
    }
}
Now the driver class: AircelDriver.class
import java.io.IOException;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
public class AircelDriver
{
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException
    {
        if(args.length<2)
        {
            System.out.println(" type ip and op file correctly");
            System.exit(-1);
        }
        Job job = Job.getInstance();
        job.setJobName(" @@@@@@@@@@@@@@@ MY FIRST PROGRAM @@@@@@@@@@@@@@@");
        job.setJarByClass(AircelDriver.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setMapperClass(AircelMapper.class);
        job.setReducerClass(AircelReducer.class);
        job.submit();
        job.waitForCompletion(true);
    }
}
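I have not posted the reducer class because the problem occurs at runtime in the mapper code. The Hadoop runtime output is shown below (essentially an indication that the job failed):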
16/12/18 04:11:00 INFO mapred.LocalJobRunner: Starting task: attempt_local1618565735_0001_m_000000_0
16/12/18 04:11:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/12/18 04:11:01 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
16/12/18 04:11:01 INFO mapred.MapTask: Processing split: hdfs://quickstart.cloudera:8020/practice/Data_File.txt:0+1198702
16/12/18 04:11:01 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/12/18 04:11:01 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/12/18 04:11:01 INFO mapred.MapTask: soft limit at 83886080
16/12/18 04:11:01 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/12/18 04:11:01 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/12/18 04:11:01 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/12/18 04:11:01 INFO mapreduce.Job: Job job_local1618565735_0001 running in uber mode : false
16/12/18 04:11:01 INFO mapreduce.Job: map 0% reduce 0%
16/12/18 04:11:02 INFO mapred.MapTask: Starting flush of map output
16/12/18 04:11:02 INFO mapred.MapTask: Spilling map output
16/12/18 04:11:02 INFO mapred.MapTask: bufstart = 0; bufend = 290000; bufvoid = 104857600
16/12/18 04:11:02 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26174400(104697600); length = 39997/6553600
16/12/18 04:11:03 INFO mapred.MapTask: Finished spill 0
16/12/18 04:11:03 INFO mapred.LocalJobRunner: map task executor complete.
16/12/18 04:11:03 WARN mapred.LocalJobRunner: job_local1618565735_0001
java.lang.Exception: java.lang.StringIndexOutOfBoundsException: String index out of range: 84
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 84
at java.lang.String.substring(String.java:1907)
at AircelMapper.map(AircelMapper.java:13)
at AircelMapper.map(AircelMapper.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(Fut
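Why does it throw a StringIndexOutOfBoundsException? Does the String class have an internal limit on the size of a string? I do not understand what is wrong on lines 13-15 of the mapper class.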
1 Answer
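String.substring(int beginIndex, int endIndex) throws IndexOutOfBoundsException if beginIndex is negative, or endIndex is larger than the length of this String object, or beginIndex is larger than endIndex.
public StringIndexOutOfBoundsException(int index) constructs a new StringIndexOutOfBoundsException with an argument indicating the illegal index: 84 in your case.
public StringIndexOutOfBoundsException(String s) constructs a StringIndexOutOfBoundsException with the specified detail message: "String index out of range" in your case.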
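The bounds rule quoted above can be reproduced outside Hadoop. The following is only a minimal sketch: the class name SubstringBoundsDemo, the helper tryBounds and the 25-character sample line are made up for illustration, and the exact wording of the exception message depends on the Java version, so the sketch does not rely on it.

```java
public class SubstringBoundsDemo {
    // Hypothetical helper: returns the exception message raised by
    // substring(begin, end), or null when the call succeeds.
    static String tryBounds(String s, int begin, int end) {
        try {
            s.substring(begin, end);
            return null;
        } catch (StringIndexOutOfBoundsException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A made-up short record of 25 characters, not taken from the real input file.
        String shortLine = "subid=0000111911128052627";

        // endIndex equal to the length is allowed, so this call succeeds.
        System.out.println(tryBounds(shortLine, 6, 25));

        // endIndex 84 is larger than the length 25, so this call throws.
        System.out.println(tryBounds(shortLine, 81, 84));
    }
}
```

The second call fails for the same reason as the mapper: the requested end index lies beyond the end of the string being processed.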
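Check your input at index 84. substring(81, 84) needs the line to be at least 84 characters long, so at least one line reaching the mapper is shorter than that. The String class has no such size limit; the exception only says that the requested end index lies past the end of that particular line.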
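One way to act on that advice is to check the line length before slicing it. The sketch below is plain Java so it can be run without a cluster; the class name SafeFieldExtractor and the method extract are hypothetical, and the offsets 6, 26, 81 and 84 are simply copied from the question's mapper, so whether they match the real record layout is an assumption.

```java
public class SafeFieldExtractor {
    // Offsets copied from the question's mapper (assumed, not verified against the data).
    static final int KEY_BEGIN = 6;
    static final int KEY_END = 26;
    static final int BYTES_BEGIN = 81;
    static final int BYTES_END = 84;

    // Returns {key, bytes} as strings, or null when the line is too short
    // or the bytes field is not a valid long.
    static String[] extract(String line) {
        if (line == null || line.length() < BYTES_END) {
            return null; // short or empty line: substring(81, 84) would throw here
        }
        String key = line.substring(KEY_BEGIN, KEY_END);
        String bytes = line.substring(BYTES_BEGIN, BYTES_END);
        try {
            Long.parseLong(bytes);
        } catch (NumberFormatException e) {
            return null; // field is present but not numeric
        }
        return new String[] { key, bytes };
    }

    public static void main(String[] args) {
        // Build a synthetic 84-character line: 81 filler characters followed by "123".
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 81; i++) {
            sb.append('x');
        }
        sb.append("123");

        System.out.println(extract(sb.toString()) != null); // long enough and numeric
        System.out.println(extract("") != null);            // empty line is rejected
    }
}
```

In the mapper, the same check would go before the substring calls, and map would return without calling context.write when the line is rejected. Counting or logging the rejected lines would also show which records in Data_File.txt are shorter than expected.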