Java Spark MLlib: error "ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory:" for logistic regression in the ML library

Posted by zfycwa2u on 2021-05-29 in Hadoop

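I just tried to use the Apache Spark ML library for logistic regression, but every time I try, I get an error message such as
"ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory:"
A sample of the dataset used for the logistic regression is shown below: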

+-----+---------+---------+---------+--------+-------------+
|state|dayOfWeek|hourOfDay|minOfHour|secOfMin|     features|
+-----+---------+---------+---------+--------+-------------+
|  1.0|      7.0|      0.0|      0.0|     0.0|(4,[0],[7.0])|
+-----+---------+---------+---------+--------+-------------+

The code for the logistic regression is as follows:

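// Imports needed by this snippet
import java.util.List;
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.ml.feature.StringIndexer;
import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Data set: one label column ("state") plus four numeric feature columns
StructType schema = new StructType(new StructField[]{
    new StructField("state", DataTypes.DoubleType, false, Metadata.empty()),
    new StructField("dayOfWeek", DataTypes.DoubleType, false, Metadata.empty()),
    new StructField("hourOfDay", DataTypes.DoubleType, false, Metadata.empty()),
    new StructField("minOfHour", DataTypes.DoubleType, false, Metadata.empty()),
    new StructField("secOfMin", DataTypes.DoubleType, false, Metadata.empty())
});

// bucketsForMLs is defined elsewhere (not shown); each element exposes label() and features()
List<Row> dataFromRDD = bucketsForMLs.map(p -> RowFactory.create(
    p.label(),
    p.features().apply(0),
    p.features().apply(1),
    p.features().apply(2),
    p.features().apply(3))).collect();

Dataset<Row> stateDF = sparkSession.createDataFrame(dataFromRDD, schema);

// Assemble the four feature columns into a single "features" vector column
String[] featureCols = new String[]{"dayOfWeek", "hourOfDay", "minOfHour", "secOfMin"};
VectorAssembler vectorAssembler = new VectorAssembler()
    .setInputCols(featureCols)
    .setOutputCol("features");
Dataset<Row> stateDFWithFeatures = vectorAssembler.transform(stateDF);

// Index the "state" column into a "label" column
StringIndexer labelIndexer = new StringIndexer().setInputCol("state").setOutputCol("label");
Dataset<Row> stateDFWithLabelAndFeatures = labelIndexer.fit(stateDFWithFeatures).transform(stateDFWithFeatures);

// MLRExecutionForDF is a user-defined class (not shown)
MLRExecutionForDF mlrExe = new MLRExecutionForDF(javaSparkContext);
mlrExe.execute(stateDFWithLabelAndFeatures);

// Logistic regression part
LogisticRegressionModel lrModel = new LogisticRegression()
    .setMaxIter(maxItr)
    .setRegParam(regParam)
    .setElasticNetParam(elasticNetParam)
    // The error is raised during this call
    .fit(stateDFWithLabelAndFeatures);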

Java hadoop apache-spark apache-spark-ml logistic-regression

Source: https://stackoverflow.com/questions/45381403/java-spark-mllib-there-is-an-error-error-owlqn-failure-resetting-history-br

1 Answer

68bkxrlz #1

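I got the same error too. It comes from the Breeze (ScalaNLP) package, which Spark simply imports. The message says something about a derivative that could not be produced.
I am not sure what this really means, but on my dataset I can say that the less data I use, the more often the error is thrown. In other words, the higher the rate of missing features for the classes being trained, the more often the error occurs. I think it has something to do with the optimizer not being able to optimize correctly because there is too little information for some classes.
Apart from that, the error does not seem to prevent the code from running.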
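
One way to test the suspicion above before calling fit is to look at the training data itself. The following is only a minimal sketch in plain Java, not part of the Spark or Breeze API: the class name TrainingDataCheck and its methods are hypothetical, the rows in main are made up, and whether any of these conditions is what actually triggers NaNHistory in a given job is an assumption to verify. It counts non-finite feature values, lists feature columns that never change, and counts rows per class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical pre-training checks; plain Java, independent of Spark. */
public class TrainingDataCheck {

    /** Counts feature values that are NaN or infinite. */
    public static int countNonFinite(double[][] features) {
        int bad = 0;
        for (double[] row : features) {
            for (double v : row) {
                if (Double.isNaN(v) || Double.isInfinite(v)) {
                    bad++;
                }
            }
        }
        return bad;
    }

    /**
     * Returns the indices of columns whose value is identical in every row.
     * Assumes a rectangular matrix (all rows have the same length).
     */
    public static List<Integer> constantColumns(double[][] features) {
        List<Integer> constant = new ArrayList<>();
        if (features.length == 0) {
            return constant;
        }
        int numCols = features[0].length;
        for (int c = 0; c < numCols; c++) {
            boolean same = true;
            for (int r = 1; r < features.length; r++) {
                if (features[r][c] != features[0][c]) {
                    same = false;
                    break;
                }
            }
            if (same) {
                constant.add(c);
            }
        }
        return constant;
    }

    /** Counts how many rows carry each label value. */
    public static Map<Double, Integer> classCounts(double[] labels) {
        Map<Double, Integer> counts = new TreeMap<>();
        for (double y : labels) {
            counts.merge(y, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up rows in the column order dayOfWeek, hourOfDay, minOfHour, secOfMin
        double[][] features = {
            {7.0, 0.0, 0.0, 0.0},
            {1.0, 0.0, 0.0, 0.0},
            {3.0, Double.NaN, 0.0, 0.0}
        };
        double[] labels = {1.0, 1.0, 0.0};

        System.out.println("non-finite values: " + countNonFinite(features));
        System.out.println("constant columns:  " + constantColumns(features));
        System.out.println("rows per class:    " + classCounts(labels));
    }
}
```

If such a check reports non-finite values, columns that never vary, or a class with only a handful of rows, that would be consistent with the observation that the error shows up more often when there is little information per class.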

Posted 2021-05-29
