I have a text file that looks tab-delimited, like this:
20001204X00000 Accident 10 9 6 Hyd
20001204X00001 Accident 8 7 vzg 2
20001204X00002 Accident 10 7 sec 1
20001204X00003 Accident 23 9 kkd 23
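I want the output to be the flight number and the total passenger count, where the total is the sum of all the numeric columns in that row, like this: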
20001204X00000 25
20001204X00001 17
20001204X00002 18
20001204X00003 55
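When I try to add up the four numeric columns, I get a NullPointerException. How can I avoid the NullPointerException, and how can I replace null or blank values with zero?
This is actually Hadoop MapReduce Java code: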
package com.flightsdamage.mr;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FlightsDamage {
    public static class FlightsMaper extends Mapper<LongWritable, Text, Text, LongWritable> {
        LongWritable pass2;

        @Override
        protected void map(LongWritable key, Text value,
                org.apache.hadoop.mapreduce.Mapper.Context context)
                throws IOException, InterruptedException,NumberFormatException,NullPointerException {
            String line = value.toString();
            String[] column=line.split("|");
            Text word=new Text();
            word.set(column[0]);
            String str = "n";
            try {
                long a = Long.parseLong(str);
                long a1=Long.parseLong("col1[]");
                long a2=Long.parseLong("col2[]");
                long a3=Long.parseLong("col3[]");
                long a4=Long.parseLong("col4[]");
                long sum = a1+a2+a3+a4;
                LongWritable pass0 = new LongWritable(a1);
                LongWritable pass = new LongWritable(a2);
                LongWritable pass1 = new LongWritable(a3);
                LongWritable pass3 = new LongWritable(a4);
                pass2 = new LongWritable(sum);
            } catch (Exception e) {
                // TODO: handle exception
            }finally{
                context.write(word,pass2);
            }
        }
    }

    public static void main(String[] args)throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "Flights MR");
        job.setJarByClass(FlightsDamage.class);
        job.setMapperClass(FlightsMaper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        //FileInputFormat.addInputPath(job, new Path("/home/node1/data-AviationData.txt"));
        FileInputFormat.addInputPath(job, new Path("/home/node1/Filghtdamage.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/home/node1/output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
1 Answer
Before parsing a string, you need to check whether it is numeric. For example:
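A minimal sketch of that check-then-parse pattern, using only the standard library. The `isNumeric` method here is a hand-written stand-in for Apache Commons Lang's `StringUtils.isNumeric()`, and the names `SafeParse` and `parseOrZero` are illustrative assumptions rather than anything from the original post.

```java
final class SafeParse {
    // Stand-in for StringUtils.isNumeric (assumption: stdlib-only replacement).
    // True only for a non-null, non-empty string made up entirely of digits.
    static boolean isNumeric(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Returns the parsed value, or 0 when the input is null, empty, blank,
    // or contains any non-digit character.
    // Note: a digit string too large for a long would still make
    // Long.parseLong throw NumberFormatException.
    static long parseOrZero(String s) {
        return isNumeric(s) ? Long.parseLong(s) : 0L;
    }
}
```

With this in place, `SafeParse.parseOrZero(null)` and `SafeParse.parseOrZero("Hyd")` both fall back to 0 instead of throwing.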
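If the input string is non-numeric (it may be null or any other non-numeric value), StringUtils.isNumeric() returns false, and the variable keeps 0 as its default value. Below is a simple program that demonstrates the use of StringUtils.isNumeric().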
Test class:
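A demonstration along these lines might look like the sketch below. It is not the original listing: `IsNumericDemo` is a hypothetical class name, and its `isNumeric` method is a stdlib-only stand-in written to mirror the digits-only behaviour of `StringUtils.isNumeric()` in Commons Lang 3, so that the example runs without extra dependencies.

```java
final class IsNumericDemo {
    // Stand-in for StringUtils.isNumeric (assumption: stdlib-only replacement).
    static boolean isNumeric(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Sample inputs: a plain number, empty, blank, text, signed, decimal, null.
        String[] samples = {"23", "", " ", "Hyd", "-5", "1.5", null};
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + isNumeric(s));
        }
    }
}
```

Only `23` is reported as numeric. Signed values such as `-5` and decimals such as `1.5` are rejected as well, because the check accepts digits only.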
Output:
I am assuming all the numbers are of type Integer. Otherwise, use Double.parseDouble().
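Applied to the rows in the question, the same idea can be sketched as follows. Two details of the posted mapper are worth noting first: `line.split("|")` treats `|` as a regular-expression alternation and does not split on tabs, and `Long.parseLong("col1[]")` parses a literal string rather than a column, so it always throws, `pass2` stays null, and the `context.write(word, pass2)` call in the `finally` block is the likely source of the NullPointerException. The sketch assumes the file really is tab-delimited and that the first column is the flight ID. `PassengerTotals`, `sumNumericColumns`, and `summarize` are hypothetical names, and `isNumeric` is again a stdlib stand-in for `StringUtils.isNumeric()`.

```java
final class PassengerTotals {
    // Stand-in for StringUtils.isNumeric (assumption: stdlib-only replacement).
    static boolean isNumeric(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Sums every all-digit column after the first one (the flight ID).
    // Empty, blank, or textual cells contribute 0.
    static long sumNumericColumns(String[] columns) {
        long sum = 0;
        for (int i = 1; i < columns.length; i++) {
            String cell = columns[i].trim();
            if (isNumeric(cell)) {
                sum += Long.parseLong(cell);
            }
        }
        return sum;
    }

    // Turns one tab-delimited input line into "flightId<TAB>total".
    static String summarize(String line) {
        String[] columns = line.split("\t");
        return columns[0] + "\t" + sumNumericColumns(columns);
    }

    public static void main(String[] args) {
        String[] lines = {
            "20001204X00000\tAccident\t10\t9\t6\tHyd",
            "20001204X00001\tAccident\t8\t7\tvzg\t2",
            "20001204X00002\tAccident\t10\t7\tsec\t1",
            "20001204X00003\tAccident\t23\t9\tkkd\t23"
        };
        for (String line : lines) {
            System.out.println(summarize(line));
        }
    }
}
```

In a mapper, the same logic would replace the hard-coded `parseLong` calls, with the `LongWritable` created from the sum before `context.write` so that a null value is never written.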