Job aborted when running collaborative filtering code on Spark EMR

vcudknz3 posted on 2021-05-18 in Spark

Dataset details:
Rows: 296211715
Unique users: 6988040
Cluster details:
Instance size: m5.8xlarge
Master nodes: 1
Core nodes: 8
Code:


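from pyspark.ml.feature import StringIndexer
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator

# aekt_collab_rename is the input DataFrame with userId, movieId and rating columns.
# Create numeric ids in place of the existing string ids (ALS needs numeric user and item columns).

indexer_user = StringIndexer(inputCol="userId", outputCol="userId_num")
indexer_movie = StringIndexer(inputCol="movieId", outputCol="movieId_num")

aekt_collab_rename_indexed = indexer_user.fit(aekt_collab_rename).transform(aekt_collab_rename)
aekt_collab_rename_indexed = indexer_movie.fit(aekt_collab_rename_indexed).transform(aekt_collab_rename_indexed)

# Random 80/20 split into training and test sets
(training, test) = aekt_collab_rename_indexed.randomSplit([0.8, 0.2])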

als = ALS(maxIter=5, regParam=0.01, userCol="userId_num", itemCol="movieId_num", ratingCol="rating",
          coldStartStrategy="drop")
model = als.fit(training)

# Evaluate the model by computing the RMSE on the test data

predictions = model.transform(test)
evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating",
                                predictionCol="prediction")
rmse = evaluator.evaluate(predictions)
print("Root-mean-square error = " + str(rmse))

# Generate top 10 movie recommendations for each user

userRecs = model.recommendForAllUsers(10)

userRecs.show(10)
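
The last step, `recommendForAllUsers(10)`, scores every user against every item in a single job. A possible way to reduce peak memory, offered as a sketch rather than a confirmed fix, is to process users in batches with `ALSModel.recommendForUserSubset`. It assumes `userId_num` holds the 0-based indices that `StringIndexer` assigns; the helper names `id_batches` and `recommend_in_batches` and the batch size are hypothetical.

```python
def id_batches(n_ids, batch_size):
    """Yield half-open [start, end) ranges that cover the ids 0 .. n_ids - 1."""
    for start in range(0, n_ids, batch_size):
        yield start, min(start + batch_size, n_ids)


def recommend_in_batches(model, users_df, n_users, batch_size, k=10,
                         user_col="userId_num"):
    """Yield one recommendations DataFrame per batch of users.

    model    : a fitted pyspark.ml.recommendation.ALSModel
    users_df : DataFrame with a column named like the model's userCol
    """
    for start, end in id_batches(n_users, batch_size):
        # Keep only the users whose numeric id falls inside this batch.
        subset = users_df.where(
            f"{user_col} >= {start} AND {user_col} < {end}"
        )
        yield model.recommendForUserSubset(subset, k)
```

With the variables from the code above, this could be called as `recommend_in_batches(model, training.select("userId_num").distinct(), 6988040, 500000)`, writing each yielded DataFrame out before moving on to the next batch.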

All of the code runs as expected until I try to show the userRecs data. The error is as follows:

py4j.protocol.Py4JJavaError: An error occurred while calling o277.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 13 in stage 75.0 failed 4 times, most recent failure: Lost task 13.3 in stage 75.0 (TID 4641, ip-172-31-10-178.us-west-2.compute.internal, executor 180): ExecutorLostFailure (executor 180 exited caused by one of the running tasks) Reason: Container from a bad node: container_1604516516252_0004_01_000235 on host: ip-172-31-10-178.us-west-2.compute.internal. Exit status: 137. Diagnostics: [2020-11-04 21:04:08.061]Container killed on request. Exit code is 137
[2020-11-04 21:04:08.061]Container exited with a non-zero exit code 137. 
[2020-11-04 21:04:08.061]Killed by external signal
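
For reading the log above: by shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9. The small sketch below decodes that; the function name `describe_exit_code` is made up for illustration. A killed YARN container is often the result of exceeding a memory limit, but the log excerpt itself does not state the cause.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a process exit status into a short human-readable description."""
    if code > 128:
        signum = code - 128
        try:
            # Signal names depend on the platform; on Linux, 9 is SIGKILL.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    if code == 0:
        return "exited normally"
    return f"exited with status {code}"


print(describe_exit_code(137))
```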

Is this a problem with my cluster setup? Any help would be greatly appreciated.
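
If the containers are being killed for memory reasons, one thing to check is how each node's memory is divided among executors. The following is a back-of-the-envelope sketch, not a tuned recommendation. It assumes an m5.8xlarge node (32 vCPUs, 128 GiB) and uses illustrative guesses for the reserved memory, the cores per executor, and the overhead share; the function `plan_executors` is hypothetical. The property names it returns (`spark.executor.cores`, `spark.executor.memory`, `spark.executor.memoryOverhead`) are standard Spark settings.

```python
def plan_executors(node_mem_gib, node_vcpus, cores_per_executor=5,
                   overhead_percent=20, reserved_vcpus=1, reserved_mem_gib=8):
    """Split one node's resources into equally sized executor containers.

    overhead_percent is the share of each container set aside as off-heap
    memory overhead; all defaults are illustrative assumptions.
    """
    executors_per_node = (node_vcpus - reserved_vcpus) // cores_per_executor
    usable_mib = (node_mem_gib - reserved_mem_gib) * 1024
    container_mib = usable_mib // executors_per_node
    # Spark's minimum memory overhead is 384 MiB.
    overhead_mib = max(384, container_mib * overhead_percent // 100)
    heap_mib = container_mib - overhead_mib
    return {
        "spark.executor.cores": str(cores_per_executor),
        "spark.executor.memory": f"{heap_mib}m",
        "spark.executor.memoryOverhead": f"{overhead_mib}m",
    }


conf = plan_executors(node_mem_gib=128, node_vcpus=32)
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The printed flags can be passed to `spark-submit`, or the same pairs set with `SparkSession.builder.config(key, value)` before the session is created. The memory YARN actually offers per node on EMR may differ from the assumption here, so the inputs should be checked against the cluster's own configuration.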
