Usage and code examples of the org.apache.spark.sql.DataFrame.randomSplit() method

x33g5p2x, reposted on 2022-01-18 under Other

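This article collects code examples of the org.apache.spark.sql.DataFrame.randomSplit() method in Java and shows how DataFrame.randomSplit() is used in practice. The examples come mainly from platforms such as GitHub, Stack Overflow, and Maven, and were extracted from selected projects to serve as a reference. Details of the DataFrame.randomSplit() method are as follows:
Package path: org.apache.spark.sql.DataFrame
Class name: DataFrame
Method name: randomSplit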

About DataFrame.randomSplit

No description available.

Code examples

Code example source: phuonglh/vn.vitk

void testRandomSplit(String inputFileName, int numFeatures, String modelFileName) {
  CMMParams params = new CMMParams()
    .setMaxIter(600)
    .setRegParam(1E-6)
    .setMarkovOrder(2)
    .setNumFeatures(numFeatures);
  
  JavaRDD<String> lines = jsc.textFile(inputFileName);
  DataFrame dataset = createDataFrame(lines.collect());
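  // Weights 0.9 and 0.1: roughly 90% of the rows go to training and 10% to test (sizes are approximate)
  DataFrame[] splits = dataset.randomSplit(new double[]{0.9, 0.1});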
  DataFrame trainingData = splits[0];
  System.out.println("Number of training sequences = " + trainingData.count());
  DataFrame testData = splits[1];
  System.out.println("Number of test sequences = " + testData.count());
  // train and save a model on the training data
  cmmModel = train(trainingData, modelFileName, params);
  // test the model on the test data
  System.out.println("Test accuracy:");
  evaluate(testData); 
  // test the model on the training data
  System.out.println("Training accuracy:");
  evaluate(trainingData);
}
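
The example above prints the row counts of both splits, and those counts are generally not an exact 90/10 division, because rows are assigned at random according to the weights. The following is a minimal plain-Java sketch of that idea, not Spark's implementation: the class `RandomSplitSketch` and its `randomSplit` helper are hypothetical names introduced here, and the sketch assumes a simplified single-pass scheme in which the weights are normalized to sum to 1 and each item draws one uniform random number that selects its split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RandomSplitSketch {

    // Hypothetical helper: assigns each item to one of weights.length splits.
    // Weights are normalized to sum to 1; each item draws one uniform number
    // in [0, 1) and lands in the split whose cumulative range contains it.
    static <T> List<List<T>> randomSplit(List<T> items, double[] weights, long seed) {
        double total = 0.0;
        for (double w : weights) {
            if (w < 0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }

        // Cumulative upper bounds of each split's range after normalization.
        double[] upper = new double[weights.length];
        double running = 0.0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i] / total;
            upper[i] = running;
        }
        upper[weights.length - 1] = 1.0; // guard against floating-point rounding

        List<List<T>> splits = new ArrayList<>();
        for (int i = 0; i < weights.length; i++) {
            splits.add(new ArrayList<>());
        }

        Random rng = new Random(seed); // a fixed seed makes the split reproducible
        for (T item : items) {
            double x = rng.nextDouble();
            int k = 0;
            while (x >= upper[k]) {
                k++;
            }
            splits.get(k).add(item);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(i);
        }
        List<List<Integer>> splits = randomSplit(rows, new double[]{0.9, 0.1}, 42L);
        System.out.println("Number of training rows = " + splits.get(0).size());
        System.out.println("Number of test rows = " + splits.get(1).size());
    }
}
```

Every row ends up in exactly one split, so the sizes always add up to the input size, while the individual sizes only approximate the requested proportions. Passing the same seed twice yields the same assignment in this sketch.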

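Code example source: psal/jstylo

// splitArray holds the split weights; randSeed makes the split reproducible
DataFrame[] splits = df.randomSplit(splitArray, randSeed);
for (int i = 0; i < folds; i++) {
  // ... (the source snippet is truncated here)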
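
The fragment above is truncated, so what its loop actually does is not known. One common reason to pair a seeded randomSplit with a `folds` counter is k-fold cross-validation, where each split serves once as the test set and the remaining splits are combined for training. The sketch below illustrates only that rotation in plain Java, under that assumption: `FoldRotationSketch` and `trainingFor` are hypothetical names, and ordinary lists stand in for the `DataFrame[]` returned by randomSplit (with Spark 1.x DataFrames, the combining step would typically be done with `unionAll`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FoldRotationSketch {

    // Hypothetical helper: builds the training set for one fold by
    // concatenating every split except the one held out for testing.
    static <T> List<T> trainingFor(List<List<T>> splits, int testIndex) {
        List<T> training = new ArrayList<>();
        for (int j = 0; j < splits.size(); j++) {
            if (j != testIndex) {
                training.addAll(splits.get(j));
            }
        }
        return training;
    }

    public static void main(String[] args) {
        // Made-up splits standing in for the result of a random split.
        List<List<String>> splits = Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("c"),
            Arrays.asList("d", "e"));

        int folds = splits.size();
        for (int i = 0; i < folds; i++) {
            List<String> test = splits.get(i);
            List<String> training = trainingFor(splits, i);
            System.out.println("fold " + i + ": training=" + training + ", test=" + test);
        }
    }
}
```

Because every row belongs to exactly one split, each row is used for testing exactly once across all folds.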
