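This is my first time using Hadoop, and I'm having trouble writing to the output file. When I print the value with System.out it looks fine, but context.write(key, value) writes the value as NaN.
Example: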
System.out.println(stockName.toString() + " " + result.toString());
correctly prints to the user log:
AAPL.csv 0.076543
But using:
context.write(stockName, result);
writes:
AAPL.csv NaN
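result and stockName are both Text objects that were set earlier.
I've also included my entire reduce function below. Any ideas would be great, as I've tried everything I can think of. Thanks!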
public static class Reduce extends Reducer<Text, Text, Text, Text> {
    private Text stockName = new Text();
    private ArrayList<Float> monthlyReturn = new ArrayList<Float>();
    private String previousMonth = "";
    private float numOfMonths = 0;
    private float startPrice = 0;
    private float endPrice = 0;
    private Text result = new Text();

    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Set the Stock Name as the Key
        stockName.set(key);
        for (Text val: values) {
            System.out.println(val);
            // Parse date & adjusted close
            String[] stockValues = val.toString().split(",");
            if (stockValues.length < 2) {
                continue;
            }
            String month = stockValues[0];
            String priceInput = stockValues[1];
            float closingPrice = Float.parseFloat(priceInput);
            // First time around setup.
            if (startPrice == 0 && previousMonth.equals("")) {
                startPrice = closingPrice;
                previousMonth = month;
            }
            /*
             * We check if the month has changed, and that we're not just starting.
             * If the month changed, increment the number of months we have seen, and run a calculation
             * for monthly return.
             *
             * closePrice is set to every stock value. The startPrice is only set when the month changes.
             * When the month does change, we take the last set closePrice to run our calculation, and
             * then set the new startPrice.
             */
            if (!month.equals(previousMonth) && endPrice != 0) {
                numOfMonths += 1;
                monthlyReturn.add((endPrice - startPrice)/startPrice);
                startPrice = closingPrice;
            }
            previousMonth = month;
            endPrice = closingPrice;
        }
        // Add on the last month value
        numOfMonths += 1;
        monthlyReturn.add((endPrice - startPrice)/startPrice);
        /*
         * Generate the volatility. The equation is as follows:
         *
         * 1. xbar = sum(xi)/numOfMonth -> sum is over all values from 0 to N in monthlyReturn
         * 2. xsum = sum( (xi-xbar)^2 ) from 0 to N in monthlyReturn
         * 3. volatility = sqrt( (1/numOfMonth-1)*xsum )
         */
        // 1.
        float xiSum = 0;
        for (int i = 0; i < monthlyReturn.size(); i++) {
            xiSum += monthlyReturn.get(i);
        }
        float xBar = xiSum/numOfMonths;
        // 2.
        double xSum = 0;
        for (int i = 0; i < monthlyReturn.size(); i++) {
            xSum += Math.pow(monthlyReturn.get(i) - xBar, 2);
        }
        // 3.
        double root = (1/(numOfMonths-1))*xSum;
        result.set(String.valueOf(Math.sqrt(root)));
        System.out.println(stockName.toString() + " " + result.toString());
        context.write(stockName, result);
    }
}
public static void main(String[] args) throws Exception {
    Job job = Job.getInstance();
    job.setJarByClass(StockVolatility.class);
    job.setMapperClass(Map.class);
    job.setCombinerClass(Reduce.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}
1 Answer
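Don't use job.setCombinerClass(Reduce.class); once I removed that line, my problem was solved. The likely reason is that the combiner runs the same Reduce logic on the map side, so the real reducer receives an already-computed volatility string instead of "date,price" records. Every value then fails the stockValues.length < 2 check, startPrice and endPrice stay at 0, and (endPrice - startPrice)/startPrice evaluates to 0/0, which is NaN for floats. That would also explain why the System.out line in the logs showed a sensible number: it came from the combiner pass, not from the final reduce.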
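To illustrate why reusing Reduce as the combiner can turn the result into NaN, here is a minimal stand-alone sketch with no Hadoop dependency. It copies the arithmetic from the question's reduce() into a plain static method; the class name CombinerNaNDemo, the method volatility, and the sample "month,price" records are all made up for illustration, and the record format is only assumed from the question's parsing code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: mirrors the question's reduce() arithmetic on plain strings.
class CombinerNaNDemo {

    // Same steps as the question's reduce(), but with local state instead of fields.
    static double volatility(List<String> values) {
        List<Float> monthlyReturn = new ArrayList<>();
        String previousMonth = "";
        float numOfMonths = 0;
        float startPrice = 0;
        float endPrice = 0;

        for (String val : values) {
            String[] stockValues = val.split(",");
            if (stockValues.length < 2) {
                continue; // a combiner result such as "0.076543" has no comma and is skipped
            }
            String month = stockValues[0];
            float closingPrice = Float.parseFloat(stockValues[1]);
            if (startPrice == 0 && previousMonth.equals("")) {
                startPrice = closingPrice;
                previousMonth = month;
            }
            if (!month.equals(previousMonth) && endPrice != 0) {
                numOfMonths += 1;
                monthlyReturn.add((endPrice - startPrice) / startPrice);
                startPrice = closingPrice;
            }
            previousMonth = month;
            endPrice = closingPrice;
        }

        // If no record was parsed, this is (0 - 0) / 0, which is NaN in float arithmetic.
        numOfMonths += 1;
        monthlyReturn.add((endPrice - startPrice) / startPrice);

        float xiSum = 0;
        for (float r : monthlyReturn) {
            xiSum += r;
        }
        float xBar = xiSum / numOfMonths;

        double xSum = 0;
        for (float r : monthlyReturn) {
            xSum += Math.pow(r - xBar, 2);
        }
        return Math.sqrt((1 / (numOfMonths - 1)) * xSum);
    }

    public static void main(String[] args) {
        // Made-up records in the "month,price" shape the reducer expects from the mapper.
        List<String> mapOutput = Arrays.asList(
                "01,100.0", "01,110.0", "02,110.0", "02,121.0", "03,121.0", "03,115.0");
        double first = volatility(mapOutput);
        System.out.println("from map output: " + first);

        // What the reducer sees if a combiner has already applied the same logic.
        List<String> combinerOutput = Arrays.asList(String.valueOf(first));
        System.out.println("from combiner output: " + volatility(combinerOutput)); // prints NaN
    }
}
```

The first call works on raw records and yields an ordinary number; the second receives only the first call's result as its input, skips it, and returns NaN. If a combiner is still wanted, it would need to emit values in the same format the mapper emits, since Hadoop may run a combiner zero, one, or several times before the reducer sees the data.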